2013-08-21

Puppet: Exiting; no certificate found and waitforcert is disabled

I have a number of servers that were built using Puppet. They contact a central puppet master and pull configs. This had been working quite well for a while. Then I noticed that they had suddenly started silently failing to do any updates. I then tried this manually:
# puppet agent --test
Exiting; no certificate found and waitforcert is disabled
Well, that's not too useful. Other puppet slaves are running, and the puppet master doesn't have a full disk or anything. Then I noticed the following:
ls -al /var/lib/puppet/ssl/certificate_requests/
-rw-r----- 1 puppet puppet 1610 Jan 17  2013 hostname.example.net.pem
Weird, why was there a request for this? Not sure. But doing a quick rm of that file and then re-running "puppet agent --test" made puppet create a new cert and submit it to the master. I then ran "puppet cert --sign --all" and it's good to go! So, not sure about the root cause yet, but this solution helped me out and I wanted to share.
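To spot this situation on other agents, something like the following could help. It is only a sketch: the function name stale_csrs is made up, and it assumes the agent keeps signed certs in a "certs" directory next to "certificate_requests" under the ssl dir shown above. It lists requests that have no signed cert of the same name.

```shell
#!/bin/sh
# Sketch (hypothetical helper): list certificate requests under an agent's
# ssl dir that have no matching signed certificate.
# Assumes signed certs live in $ssldir/certs/<name>.pem.
stale_csrs() {
    ssldir=${1:-/var/lib/puppet/ssl}
    for csr in "$ssldir"/certificate_requests/*.pem; do
        [ -e "$csr" ] || continue           # glob matched nothing
        name=$(basename "$csr")
        # A request with no signed cert of the same name is a leftover.
        [ -e "$ssldir/certs/$name" ] || echo "$csr"
    done
}

# Example: review the list first, then remove and re-run the agent.
#   stale_csrs
#   stale_csrs | xargs rm -f && puppet agent --test
```

Printing the paths instead of deleting them right away leaves room to look at what it found before removing anything.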

2013-05-20

MySQL 5.6.x Admin Password

So, I was rolling out a fresh MySQL server the other day, as I've done many times before, and ran into some odd behaviour. I downloaded a fresh set of RPMs from Oracle's download page and did my usual:
# yum install MySQL-*-5.6.11*.rpm -y
# mysql_install_db
# mysql_secure_installation
And then I got the root password prompt. Now usually you just hit enter because the root password is blank. But it wasn't taking that today. Wiped the DB, erased and reinstalled, still didn't take the empty password. Until I did:
# ls -al
total 366900
dr-xr-x---.  4 root root      4096 May 20 12:08 .
dr-xr-xr-x. 24 root root      4096 May 20 11:26 ..
-rw-r--r--.  1 root root  23010735 May 20 11:07 MySQL-client-5.6.11-2.linux_glibc2.5.x86_64.rpm
-rw-r--r--.  1 root root   4554269 May 20 11:07 MySQL-devel-5.6.11-2.linux_glibc2.5.x86_64.rpm
-rw-r--r--.  1 root root 112519557 May 20 11:08 MySQL-embedded-5.6.11-2.linux_glibc2.5.x86_64.rpm
-rw-------.  1 root root       192 May 20 12:00 .mysql_secret
-rw-r--r--.  1 root root  56354288 May 20 11:09 MySQL-server-5.6.11-2.el6.x86_64.rpm
-rw-r--r--.  1 root root  88319899 May 20 11:10 MySQL-server-5.6.11-2.linux_glibc2.5.x86_64.rpm
-rw-r--r--.  1 root root   2389748 May 20 11:10 MySQL-shared-5.6.11-2.linux_glibc2.5.x86_64.rpm
-rw-r--r--.  1 root root   5180812 May 20 11:10 MySQL-shared-compat-5.6.11-2.linux_glibc2.5.x86_64.rpm
-rw-r--r--.  1 root root  72675691 May 20 11:11 MySQL-test-5.6.11-2.linux_glibc2.5.x86_64.rpm
#
Wait, what's that ".mysql_secret" file?! I didn't put that there. Apparently the installer did... Turns out that, upon initial install, the MySQL packages run mysql_install_db automatically... and then set the root password to something random.
# cat .mysql_secret

# The random password set for the root user at Mon May 20 12:00:16 2013 (local time): mS2tzW4Z
So, you'd think that you could still run mysql_secure_installation and just use that password. Except that you end up getting the message "ERROR 1862 (HY000): Your password has expired. To log in you must change it using a client that supports expired passwords." Which is apparently the normal text client. So, feel free to reset your password the old-fashioned way:
# mysql -u root -pmS2tzW4Z

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 6
Server version: 5.6.11

Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> SET PASSWORD FOR 'root'@'localhost' = PASSWORD('ub3rS3cureP455word!');
Query OK, 0 rows affected (0.00 sec)

mysql> FLUSH PRIVILEGES;
Query OK, 0 rows affected (0.00 sec)

Then you can run mysql_secure_installation and enjoy your new MySQL Install.
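If you're scripting installs, pulling the generated password out of the file by hand gets old. Here's a minimal sketch, assuming the file keeps the one-line comment format shown above with the password as the last word on the line. The function name is made up for this example.

```shell
#!/bin/sh
# Sketch (hypothetical helper): print the random root password recorded
# in a .mysql_secret file. Assumes the password is the last
# whitespace-separated word on the "random password" line.
mysql_secret_password() {
    secret_file=${1:-/root/.mysql_secret}
    # If the file has several entries, the last one is the most recent.
    awk '/random password/ { print $NF }' "$secret_file" | tail -n 1
}

# Example:
#   mysql -u root -p"$(mysql_secret_password)"
```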

2013-02-09

Nginx + FastCGI with the Quickness

This is on a CentOS 6.x install (I've got 6.3 here). First off, you'll need nginx and spawn-fcgi as well as php. For simplicity, I'll just go with the 5.3.3 that yum pulls in, but really any version will work (I ran it with php 5.4.8 for this example), as long as it's compiled with CGI support and you have 'php-cgi' available, since that's what spawn-fcgi executes.
yum install nginx spawn-fcgi php -y
This will install them with their default configs. You'll need to tweak a few things. First off, let's tackle /etc/sysconfig/spawn-fcgi

Spawn-fcgi Config

SOCKET=/var/run/php-fcgi.sock
OPTIONS="-u nginx -g nginx -s $SOCKET -S -M 0600 -C 8 -F 1 -P /var/run/spawn-fcgi.pid -- /usr/bin/php-cgi"
By default this ships with -C 32, which means it'll start 32 php-cgi processes. That seems like a lot in my experience. We have some very busy image servers and they do well with 4 to 8. I usually go with the "# cores + 2" idea and it's worked well for me so far. Anyway, you'll also want to make sure you remember where that socket is defined. It doesn't really matter where it is, but it matters that you remember it!
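The "# cores + 2" rule of thumb is easy to compute rather than guess. This is a rough sketch: the function name is made up, and it assumes getconf can report the online CPU count, which it does on Linux.

```shell
#!/bin/sh
# Sketch (hypothetical helper): suggest a value for spawn-fcgi's -C option
# using the "cores + 2" rule of thumb.
fcgi_children() {
    # Use the given core count, or ask the system how many CPUs are online.
    cores=${1:-$(getconf _NPROCESSORS_ONLN)}
    echo $((cores + 2))
}

# Example: on a 4-core box this suggests -C 6
#   fcgi_children 4
```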

Nginx Config

server {
  listen  80;
  server_name zabbix.example.com;
  root   /var/www/zabbix;
 
  location / {
   index  index.html index.htm index.php;
  }
 
  location ~ \.php$ {
   include /etc/nginx/fastcgi.conf;
   fastcgi_pass unix:/var/run/php-fcgi.sock;
   fastcgi_index index.php;
  }
 }
This server block will go in either your main nginx.conf file (/etc/nginx/nginx.conf on CentOS), or in a file included from that one. This will define a vhost listening on that "server_name", hosted in that "root". It will use the parameters in /etc/nginx/fastcgi.conf and pass requests over to the socket defined above (I told you to remember that!). Basically, the second location block tells nginx that any request ending in .php should be handed to php-cgi over the FastCGI socket. ... And that's it! I highly recommend trawling through the php.ini options as well as any other options in nginx to make sure there aren't any red flags flying (I know I've tweaked a lot outside of this), but this should get you serving PHP!
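Since the one thing that has to line up is the socket path, a quick sanity check can catch a mismatch before you go hunting for 502 errors. This is only a sketch with a made-up function name. It assumes the spawn-fcgi file has a plain SOCKET=... line and that the nginx file has a single "fastcgi_pass unix:...;" line, as in the configs above.

```shell
#!/bin/sh
# Sketch (hypothetical helper): succeed only if the socket in the
# spawn-fcgi sysconfig file matches the one nginx passes requests to.
socket_matches() {
    sysconfig=$1    # e.g. /etc/sysconfig/spawn-fcgi
    nginxconf=$2    # file containing the server block
    spawn_sock=$(sed -n 's/^SOCKET=//p' "$sysconfig" | head -n 1)
    nginx_sock=$(sed -n 's/^[[:space:]]*fastcgi_pass[[:space:]]*unix:\([^;]*\);.*/\1/p' "$nginxconf" | head -n 1)
    [ -n "$spawn_sock" ] && [ "$spawn_sock" = "$nginx_sock" ]
}

# Example:
#   socket_matches /etc/sysconfig/spawn-fcgi /etc/nginx/nginx.conf || echo "socket mismatch"
```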

2013-02-05

Puppet Error: header too long

If you're working with Puppet and you find that you get this error:
puppet cert --list
Error: header too long
Be mindful of your free space! I've now rolled out 20 servers or so in my Puppet setup, soon to be duplicated to over 142 servers once I get these running right. (All I'll have to do is spin up a new server, give it an IP and hostname, and tell it where the Puppet Master is, and Puppet will handle the rest!) I've found that I'm starting to easily fill up the drive with old reports, especially when re-running puppet syncs more frequently than the normal 30-minute run interval. I started getting the above error with a lot of different puppet commands, the simplest one being just trying to list certs. Then I checked a "df -h":
# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1              16G   15G     0 100% /
Oops! Using the following script I was able to clean up old reports easily. Set the "days" variable as high as you want for your setup. I'm using Puppet Dashboard to pull reports into a DB, so I don't need to keep the YAML files around too long.
#!/bin/sh
days="+1"       # more than a day old

# Each host gets its own directory under reports/
for d in `find /var/lib/puppet/reports -mindepth 1 -maxdepth 1 -type d`
do
        # Sort newest first, skip the first (newest) match, delete the rest
        find "$d" -type f -name \*.yaml -mtime $days |
        sort -r |
        tail -n +2 |
        xargs /bin/rm -f
done
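Since the real culprit here was a full disk, it may be worth checking usage before it hits 100%. The following is a minimal sketch with made-up function names. It reads the portable "df -P" format and compares the Use% column against a threshold you pick.

```shell
#!/bin/sh
# Sketch (hypothetical helpers): warn before a filesystem fills up.

# Reads `df -P` output on stdin and prints the use percentage of the
# first filesystem listed, without the % sign.
usage_pct() {
    awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Succeeds if the filesystem holding $1 is at or above $2 percent used.
disk_nearly_full() {
    mount=${1:-/}
    limit=${2:-90}
    pct=$(df -P "$mount" | usage_pct)
    [ "$pct" -ge "$limit" ]
}

# Example (e.g. from cron on the puppet master):
#   disk_nearly_full / 90 && echo "root filesystem is almost full"
```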
In my case, since it tried to sync a new server ssl cert while the drive was full, the error came out to be due to not only the free space, but a corrupt cert. To find the offending cert and fix the issue, you'll need to look through the /var/lib/puppet dir for the file. The host I was looking for is 'betamem.example.com' and I found it like this:
# cd /var/lib/puppet
# find ./|grep betamem
./ssl/ca/requests/betamem.example.com
I then removed the cert request (held in /var/lib/puppet/ssl/certificate_requests/) from the agent on 'betamem' and told it to try again by cycling its puppet agent.
# rm -f /var/lib/puppet/ssl/certificate_requests/*
# /etc/init.d/puppet restart
Stopping puppet agent:                                     [  OK  ]
Starting puppet agent:                                     [  OK  ]
Tailing /var/log/messages on the master shows it's got a new request, so let's sign it:
# tail /var/log/messages -n1
puppet-master[22486]: betamem.example.com has a waiting certificate request
# puppet cert --sign betamem.example.com
Signed certificate request for betamem.example.com
Removing file Puppet::SSL::CertificateRequest at '/var/lib/puppet/ssl/ca/requests/betamem.example.com.pem'
Go back to the puppet agent and cycle it again, or just wait until the next run-interval and it should be back to normal!

2012-12-18

Error Performing Checksum

Getting something like this when running an update from your repo?
MyRepo/primary http://repo.example.com/repo/CentOS/5.8/x86_64/repodata/primary.xml.gz: [Errno -3] Error performing checksum
Well, it seems CentOS 5.x requires sha (SHA-1), not sha256. So when you run createrepo, make sure to use 'createrepo --checksum=sha' instead.
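If one build box generates repos for several CentOS releases, the checksum type can be picked by major version. This is a small sketch with a made-up function name. The cutoff at version 5 comes from the behaviour described above, and the repo path in the example is a placeholder.

```shell
#!/bin/sh
# Sketch (hypothetical helper): pick a createrepo checksum type that the
# target release's yum can verify. CentOS 5.x and older get sha (SHA-1).
repo_checksum() {
    major=$1
    if [ "$major" -le 5 ]; then
        echo sha
    else
        echo sha256
    fi
}

# Example (placeholder path):
#   createrepo --checksum="$(repo_checksum 5)" /path/to/repo
```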

2012-06-26

Trac: Fixing the dreaded "Duplicate entry" Error during SVN Rescan

So, the other day I was minding my own business, resyncing an SVN repository in Trac, and I ran across this error:
File "/usr/bin/trac-admin", line 7, in ? sys.exit(
  File "/usr/lib/python2.4/site-packages/Trac-0.11.6-py2.4.egg/trac/admin/console.py", line 1325, in run admin.run()
  File "/usr/lib/python2.4/site-packages/Trac-0.11.6-py2.4.egg/trac/admin/console.py", line 155, in run self.cmdloop()
  File "/usr/lib64/python2.4/cmd.py", line 142, in cmdloop stop = self.onecmd(line)
  File "/usr/lib/python2.4/site-packages/Trac-0.11.6-py2.4.egg/trac/admin/console.py", line 138, in onecmd rv = cmd.Cmd.onecmd(self, line) or 0
  File "/usr/lib64/python2.4/cmd.py", line 219, in onecmd return func(arg)
  File "/usr/lib/python2.4/site-packages/Trac-0.11.6-py2.4.egg/trac/admin/console.py", line 680, in do_resync repos = env.get_repository().sync(self._resync_feedback)
  File "/usr/lib/python2.4/site-packages/Trac-0.11.6-py2.4.egg/trac/versioncontrol/cache.py", line 214, in sync (str(next_youngest),
  File "/usr/lib/python2.4/site-packages/Trac-0.11.6-py2.4.egg/trac/db/util.py", line 64, in execute return self.cursor.execute(sql_escape_percent(sql), args)
  File "/usr/lib64/python2.4/site-packages/MySQLdb/cursors.py", line 163, in execute    self.errorhandler(self, exc, value)
  File "/usr/lib64/python2.4/site-packages/MySQLdb/connections.py", line 35, in defaulterrorhandler raise errorclass, errorvalue
_mysql_exceptions.IntegrityError: (1062, "Duplicate entry '37295-releases/may0712-D' for key 'PRIMARY'")

How did we get here?!

As background, this is a copy of a repository we've been working with for a while. We're copying it to a new repository as we have two sets of files in it that were originally connected projects but have grown farther and farther apart to the point where they don't really reference each other or need to be together. At one point we had a few people who somehow checked in duplicate deletions of the same file in the original repository. (How that got past SVN is another story, but it wasn't anything out of the ordinary). In any case, this isn't the first time I've run across this error and since we made a copy of the repo that had that issue, we now have it here.

Now, if this was the very last revision and the repo was quiescent, you could do a little clever hacking on the svn repo to remove the duplication... but that's not possible to do here. No, instead we're going to get MySQL Crafty.

Time for MySQL Sleuthing

First off, I open up my trusty command-line mysql client and pick my database. Looking at the list of tables you'll see the "revision" and "node_change" tables, which look like this:
mysql> describe revision;
+---------+---------+------+-----+---------+-------+
| Field   | Type    | Null | Key | Default | Extra |
+---------+---------+------+-----+---------+-------+
| rev     | text    | NO   | PRI | NULL    |       |
| time    | int(11) | YES  | MUL | NULL    |       |
| author  | text    | YES  |     | NULL    |       |
| message | text    | YES  |     | NULL    |       |
+---------+---------+------+-----+---------+-------+
4 rows in set (0.00 sec)

mysql> describe node_change;
+-------------+------+------+-----+---------+-------+
| Field       | Type | Null | Key | Default | Extra |
+-------------+------+------+-----+---------+-------+
| rev         | text | NO   | PRI | NULL    |       |
| path        | text | NO   | PRI | NULL    |       |
| node_type   | text | YES  |     | NULL    |       |
| change_type | text | NO   | PRI | NULL    |       |
| base_path   | text | YES  |     | NULL    |       |
| base_rev    | text | YES  |     | NULL    |       |
+-------------+------+------+-----+---------+-------+

A Plan Comes Together

'rev' is a Primary Key (or part of one) in both cases and, according to the message above, we're trying to insert the same primary key twice. So how are we going to fix this? If I delete the offending records, it'll rescan the repo up to that point, see that it's not there, add it, then add the second one. So, that doesn't help us at all. We need to leave it there but not let it complain the next time. If we leave it there but drop the unique requirement, it'll write it for the first rev, and then write over it again with the second rev, or (if it doesn't look for an existing record first) it'll just add the line twice. In the latter case, we can clean that up later and then re-institute the unique requirement. Sounds like a win, let's do it.

Dropping Constraints

mysql> alter table revision drop primary key;
Query OK, 37295 rows affected (0.11 sec)
Records: 37295  Duplicates: 0  Warnings: 0

mysql> alter table node_change drop primary key;
Query OK, 328916 rows affected (3.13 sec)
Records: 328916  Duplicates: 0  Warnings: 0


mysql> describe revision;
+---------+---------+------+-----+---------+-------+
| Field   | Type    | Null | Key | Default | Extra |
+---------+---------+------+-----+---------+-------+
| rev     | text    | NO   |     | NULL    |       |
| time    | int(11) | YES  | MUL | NULL    |       |
| author  | text    | YES  |     | NULL    |       |
| message | text    | YES  |     | NULL    |       |
+---------+---------+------+-----+---------+-------+
4 rows in set (0.00 sec)

mysql> describe node_change;
+-------------+------+------+-----+---------+-------+
| Field       | Type | Null | Key | Default | Extra |
+-------------+------+------+-----+---------+-------+
| rev         | text | NO   | MUL | NULL    |       |
| path        | text | NO   |     | NULL    |       |
| node_type   | text | YES  |     | NULL    |       |
| change_type | text | NO   |     | NULL    |       |
| base_path   | text | YES  |     | NULL    |       |
| base_rev    | text | YES  |     | NULL    |       |
+-------------+------+------+-----+---------+-------+
6 rows in set (0.00 sec)
Alright! Let's resync again... just running "trac-admin resync"... and watching it count... for a while. Eventually we'll see something like:
37309 revisions cached.
Done.

Searching for UFOs

So, now we've got all that data in the db... including the duplicate revision. We can't just add back the keys because it'll fail the constraint validation. (If you run "alter table node_change add primary key (rev(16), path(512), change_type(1));" you'll get "ERROR 1062 (23000): Duplicate entry '32152-branches/...' for key 'PRIMARY'".) We're going to have to clear that extra one out first, so let's find it.
mysql> select count(*) as c,rev,path,change_type from node_change group by concat(rev, path, change_type) order by c desc limit 10;
+---+-------+---------------------------------+-------------+
| c | rev   | path                            | change_type |
+---+-------+---------------------------------+-------------+
| 2 | 37295 | releases/May0712                | D           |
| 1 | 33277 | releases/S2611/myApp/main.cpp   | E           |
| 1 | 33277 | releases/S2611/myApp            | E           |
...
Well, there she is, 'c'=2, so we've got two of the same thing, just like we thought. I put a limit of 10 instead of only pulling the top result so we could make sure that there were no other problem entries pulled in after that one during the sync. If there were, we'd have to do the following steps for each.
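The same hunt can be done outside the database on a dump of the key columns. This is a sketch with a made-up function name. It lowercases the key before counting, on the assumption that the table compares text case-insensitively, which would explain how two paths differing only by case collide on the primary key.

```shell
#!/bin/sh
# Sketch (hypothetical helper): find primary-key collisions in a
# tab-separated dump of rev, path, change_type (one row per line).
# Keys are lowercased first, assuming a case-insensitive collation.
dup_keys() {
    awk -F '\t' '
        { key = tolower($1 "-" $2 "-" $3); count[key]++ }
        END {
            for (k in count)
                if (count[k] > 1)
                    print count[k] "\t" k
        }
    '
}

# Example ("tracdb" is a placeholder database name):
#   mysql -N -B -e 'select rev, path, change_type from node_change' tracdb | dup_keys
```

Any line it prints is a key that would trip the "Duplicate entry" error when the primary key is added back.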

Dropping the Dupes

mysql> select * from node_change where rev=37295 and path='releases/May0712';
+-------+------------------+-----------+-------------+------------------+----------+
| rev   | path             | node_type | change_type | base_path        | base_rev |
+-------+------------------+-----------+-------------+------------------+----------+
| 37295 | releases/May0712 | D         | D           | releases/May0712 | 37294    |
| 37295 | releases/may0712 | D         | D           | releases/may0712 | 37294    |
+-------+------------------+-----------+-------------+------------------+----------+
2 rows in set (0.20 sec)

mysql> delete from node_change where rev=37295 and path='releases/may0712' limit 1;
Query OK, 1 row affected (0.19 sec)

mysql> select * from node_change where rev=37295 and path='releases/may0712';
+-------+------------------+-----------+-------------+------------------+----------+
| rev   | path             | node_type | change_type | base_path        | base_rev |
+-------+------------------+-----------+-------------+------------------+----------+
| 37295 | releases/may0712 | D         | D           | releases/may0712 | 37294    |
+-------+------------------+-----------+-------------+------------------+----------+
1 row in set (0.24 sec)
You can see that MySQL honors the 'limit' clause on deletes, saving us a little work here. Now we should be able to add our constraint back on.
mysql> alter table node_change add primary key (rev(16), path(512), change_type(1));
Query OK, 329329 rows affected (6.94 sec)
Records: 329329  Duplicates: 0  Warnings: 0

mysql> alter table revision add primary key (rev(16));
Query OK, 37309 rows affected (0.29 sec)
Records: 37309  Duplicates: 0  Warnings: 0
And we're done!

2012-03-28

PHP 5.4 Install on CentOS 5.x

So, PHP 5.4 is out now, and I figured it was time I upgraded from the PHP that comes with CentOS 5.x (PHP 5.1.x!) to something modern. After all, 5.3+ gives you lots of benefits and years of bug killing and hole plugging. I pulled down the PHP 5.4 tarball from php.net and got to work compiling. Starting with what I had from a php -i, and fixing things in the configure line, I ended up using:

./configure --build=x86_64-redhat-linux-gnu --host=x86_64-redhat-linux-gnu --target=x86_64-redhat-linux-gnu --program-prefix= --prefix=/usr --exec-prefix=/usr --bindir=/usr/bin --sbindir=/usr/sbin --sysconfdir=/etc --datadir=/usr/share --includedir=/usr/include --libdir=/usr/lib64 --libexecdir=/usr/libexec --localstatedir=/var --sharedstatedir=/usr/com --mandir=/usr/share/man --infodir=/usr/share/info --cache-file=../config.cache --with-libdir=lib64 --with-config-file-path=/etc --with-config-file-scan-dir=/etc/php.d --disable-debug --with-pic --disable-rpath --without-pear --with-bz2 --with-curl --with-freetype-dir=/usr --with-png-dir=/usr --enable-gd-native-ttf --without-gdbm --with-gettext --with-gmp --with-iconv --with-jpeg-dir=/usr --with-openssl --with-pspell --with-pcre-regex=/usr --with-zlib --with-layout=GNU --enable-exif --enable-ftp --enable-sockets --enable-sysvsem --enable-sysvshm --enable-sysvmsg --enable-trans-sid --enable-wddx --enable-shmop --enable-calendar --with-libxml-dir=/usr --enable-pcntl --with-imap=shared --with-imap-ssl --enable-mbstring=shared --enable-mbregex --with-ncurses=shared --with-gd=shared --enable-bcmath=shared --enable-dba=shared --with-xmlrpc=shared --with-mysql=/usr --with-mysqli=/usr/bin/mysql_config --enable-dom=shared --with-snmp=shared,/usr --enable-soap=shared --with-xsl=shared,/usr --enable-xmlreader=shared --enable-xmlwriter=shared --with-pdo-mysql=/usr/bin/mysql_config --with-kerberos --with-sapi --with-apxs2


This got me everything I needed, except that I realized that I then lost memcache. I did a bit of searching and I couldn't yet find a memcache extension for 5.4 (this will probably be fixed in the future). I finally found this site http://www.sohailriaz.com/how-to-install-memcached-with-memcache-php-extension-on-centos-5x/ which went into detail on how to install memcached from scratch which I didn't need to do... but it had one section that I'd like to echo here. How to use pecl and phpize to get yourself a shiny new memcache.so file for 5.4. Basically run:
wget http://pecl.php.net/get/memcache-2.2.5.tgz

Then untar it and go into the directory. Simply running:

phpize
./configure
make && make install

will create a configure script, configure the extension, build it (make sure you have gcc, but I assume if you compiled PHP that you do), and copy it all to where it needs to go! You may have to change where your memcache.ini file points for the extension or something similar, but this should get you back in business!
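The steps above can be strung together in a small script. This is only a sketch: the function names are made up, and it assumes the tarball unpacks into a directory named after the file minus ".tgz". Setting DRY_RUN=1 prints the commands instead of running them, which is handy for checking what it would do.

```shell
#!/bin/sh
# Sketch (hypothetical helpers): build a PECL extension from a tarball.

# Run a command, or just print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

build_pecl_ext() {
    tarball=$1                  # e.g. memcache-2.2.5.tgz
    srcdir=${tarball%.tgz}      # assumed unpack directory
    run tar xzf "$tarball" &&
    run cd "$srcdir" &&
    run phpize &&
    run ./configure &&
    run make &&
    run make install
}

# Example:
#   DRY_RUN=1; build_pecl_ext memcache-2.2.5.tgz
```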