2016-12-21

Keeping a process running with Flock and Cron

We've got a few processes here that aren't system services, but need to be running in the background and should be restarted if they die. This method also works for a cron job that often runs past its normal re-execution time (say, an every-5-minutes cron that sometimes runs for 7 minutes): it will prevent multiple executions from running simultaneously.

First off, in your crontab, you can add a line like this:

* * * * * flock -x -n /tmp/awesomenessRunning.lock -c "/usr/local/bin/myAwesomeScript.sh" >/dev/null 2>&1

What happens here is fairly straight forward:
  • Every minute, cron invokes flock, which executes your script, in this case "/usr/local/bin/myAwesomeScript.sh".
  • flock takes an exclusive lock on the lock file, here named "/tmp/awesomenessRunning.lock". When the script finishes, flock releases the lock.
  • The next time the cron fires, flock will again attempt to get an exclusive lock on that file... but it can't while the script is still running, so (thanks to -n) it gives up immediately and tries again on the next run. (You can see this from a shell with the quick test below.)
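
You can watch that lockout behavior from a shell, too. A quick hypothetical test (flock -n exits non-zero when it can't get the lock, which is what triggers the || fallback):

$ flock -x -n /tmp/awesomenessRunning.lock -c "sleep 60" &
$ flock -x -n /tmp/awesomenessRunning.lock -c "echo got it" || echo "lock busy"
lock busy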

Now, generally, if I'm doing this as a system-level job, I'll put the line in a file named for the job (or what the job is doing) and drop it in /etc/cron.d/. All the files there get pulled together into the system cron, which helps other admins (or your later self) find and disable it later. If you do that, remember to stick the user to execute the cron as between the *'s and the flock!
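
For example, a hypothetical /etc/cron.d/awesomeness could look like this, with the user to run as (root here) sitting between the schedule and the command:

* * * * * root flock -x -n /tmp/awesomenessRunning.lock -c "/usr/local/bin/myAwesomeScript.sh" >/dev/null 2>&1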

2016-06-24

Growing an mdadm RAID-5 array while online

Today, I decided that my 3-drive RAID-5 setup just wasn't big enough. I've got two 1.5TB drives and a 1TB drive, with a RAID-5 built from the first 1TB of each using mdadm. The extra space on the 1.5TB drives serves as scratch area for things that don't need the RAID-5's speed or resilience. But now it's time to throw another TB at it and make it a 4-drive RAID-5, and mdadm can help us out here in just a few steps. First, though, we have to partition the new drive so we can use it, and we need to know which device it is. I know the array is mounted on /array/, so a mount output tells me the md device name, and from there mdadm tells me which drives are part of it:

# mount |grep array
/dev/md3 on /array type ext4 (rw,noatime,data=ordered)

# mdadm --misc --detail /dev/md3
/dev/md3:
        Version : 0.91
  Creation Time : Sun Jul 17 21:20:35 2011
     Raid Level : raid5
...
    Number   Major   Minor   RaidDevice State
       0       8       33        0      active sync   /dev/sdc1
       1       8       49        1      active sync   /dev/sdd1
       2       8       97        2      active sync   /dev/sdg1

So, sdc1, sdd1, sdg1 are all part of this array. After inserting the new disk, I run `dmesg | grep TB`, since I know the new drive will be listed as a #TB device, and look for one that isn't already in the array:
# dmesg |grep TB
[    3.264540] sd 2:0:0:0: [sdc] 2930277168 512-byte logical blocks: (1.50 TB/1.36 TiB)
[    3.329286] sd 2:0:1:0: [sdd] 2930277168 512-byte logical blocks: (1.50 TB/1.36 TiB)
[    3.329630] sd 3:0:0:0: [sde] 1953525168 512-byte logical blocks: (1.00 TB/931 GiB)
[    5.930020] sd 9:0:0:0: [sdg] 1953525168 512-byte logical blocks: (1.00 TB/931 GiB)
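
(As an aside, if grepping dmesg feels fragile, lsblk gives a quick overview too; the new disk will be the one with no partitions or mountpoints hanging off it. The column flags below are standard lsblk options.)

# lsblk -o NAME,SIZE,TYPE,MOUNTPOINT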

Hey look, a new one, and we shall call you 'sde'. Time to make some partitions:

# fdisk /dev/sde

Welcome to fdisk (util-linux 2.26.2).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.

Device does not contain a recognized partition table.
Created a new DOS disklabel with disk identifier 0x8c82f3b1.

Command (m for help): p
Disk /dev/sde: 931.5 GiB, 1000204886016 bytes, 1953525168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: dos
Disk identifier: 0x8c82f3b1

Command (m for help): n
Partition type
   p   primary (0 primary, 0 extended, 4 free)
   e   extended (container for logical partitions)
Select (default p): p
Partition number (1-4, default 1): 1
First sector (2048-1953525167, default 2048): 2048
Last sector, +sectors or +size{K,M,G,T,P} (2048-1953525167, default 1953525167): 1953525167

Created a new partition 1 of type 'Linux' and of size 931.5 GiB.

Command (m for help): w
The partition table has been altered.
Calling ioctl() to re-read partition table.
Syncing disks.

So, above, we hit 'p' to print what was there. All it did was tell us about the disk again, because there weren't any existing partitions... This is still a good check to make sure there's nothing on the drive already and that it's really the disk you want. If this isn't a new drive and you want to clear it out first, hit 'd' and pick the number of the partition you want to delete, repeating until there aren't any left. We then created a default partition as the first and only one, taking up the whole drive.

If you want to do as I did with the 1.5TB drives and create an array partition that doesn't take up the whole drive (say you bought a 2TB drive but only want the first 1TB in this array), create a partition as normal, but change that "Last sector" number to match the last sector of one of the other drives' partitions. Running fdisk /dev/<otherDevice>, hitting 'p' to print its table, and then 'q' to quit without changes will show you the number to match. Feel free to make extra partitions for whatever else you want to do with the remainder of the disk.
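
If you'd rather not enter interactive fdisk just to read another drive's table, the list flag is a read-only way to get the same "End" sector to copy:

# fdisk -l /dev/sdc

With the partition created, it's time to add the new disk to the array and grow it to 4 devices: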

# mdadm --grow --raid-devices=4 --add /dev/md3 /dev/sde1
mdadm: added /dev/sde1
mdadm: Need to backup 192K of critical section..
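
If your mdadm build doesn't like combining --grow and --add in one shot, the traditional two-step form does the same thing: add the disk as a spare, then raise the device count, which kicks off the reshape:

# mdadm --manage /dev/md3 --add /dev/sde1
# mdadm --grow /dev/md3 --raid-devices=4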

# top
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND     
  205 root      20   0       0      0      0 S  29.2  0.0   0:16.08 md3_raid5                                                                           
 4668 root      20   0       0      0      0 D  11.0  0.0   0:04.95 md3_resync    
...
You can see that mdadm is using up some decent CPU now (this is a quad-core 2GHz Core 2 based Pentium D), crunching all that RAID-5 parity.

# iostat -m 1 100
 
Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sda               0.00         0.00         0.00          0          0
sdb               0.00         0.00         0.00          0          0
sdc             162.00        46.23        32.00         46         32
sdd             164.00        47.38        32.00         47         32
sde              91.00         0.00        29.83          0         29
md3               0.00         0.00         0.00          0          0
sdg             151.00        44.73        28.00         44         28

And iostat shows that sdc, sdd, sdg and the new sde are all moving lots of MB/sec. Interestingly, since sde is new, you can tell it's not being read from, only written to.

If you want to see detailed progress, you can run this:

# watch cat /proc/mdstat
Personalities : [raid1] [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid10] 
md128 : active raid1 sdb1[1] sda1[0]
      112972800 blocks super 1.2 [2/2] [UU]
      bitmap: 1/1 pages [4KB], 65536KB chunk

md3 : active raid5 sde1[3] sdg1[2] sdd1[1] sdc1[0]
      1953519872 blocks super 0.91 level 5, 32k chunk, algorithm 2 [4/4] [UUUU]
      [>....................]  reshape =  3.7% (36999992/976759936) finish=504.7min speed=31026K/sec

Here you can see both 'md128', my RAID-1 boot drive mirror (sda1 and sdb1), and the now-expanding RAID-5 'md3' using sdg1, sdd1, sdc1, and of course the new sde1. Because that's run via 'watch', it'll update every 2 seconds by default. Taking the 'watch' off the front will give you a one-time status page.

And now we wait... about 504.7 minutes, apparently. Finally, you'll see:

# cat /proc/mdstat
...
md3 : active raid5 sde1[3] sdg1[2] sdd1[1] sdc1[0]
      2930279808 blocks level 5, 32k chunk, algorithm 2 [4/4] [UUUU]
...
# dmesg | tail
...
[42646.351875] md: md3: reshape done.
[42648.156073] RAID conf printout:
[42648.156078]  --- level:5 rd:4 wd:4
[42648.156081]  disk 0, o:1, dev:sdc1
[42648.156084]  disk 1, o:1, dev:sdd1
[42648.156086]  disk 2, o:1, dev:sdg1
[42648.156088]  disk 3, o:1, dev:sde1
[42648.156094] md3: detected capacity change from 2000404348928 to 3000606523392
[42649.508764] VFS: busy inodes on changed media or resized disk md3

But according to df, our filesystem still shows the old size. The filesystem and its allocation tables were laid down while the device was smaller, so, if it's formatted with any of the ext filesystem types, it can be enlarged online with the following command.

# resize2fs /dev/md3
resize2fs 1.42.12 (29-Aug-2014)
Filesystem at /dev/md3 is mounted on /array; on-line resizing required
old_desc_blocks = 117, new_desc_blocks = 175
The filesystem on /dev/md3 is now 732569952 (4k) blocks long.
# dmesg|tail
...
[52020.706909] EXT4-fs (md3): resizing filesystem from 488379968 to 732569952 blocks
[52023.727545] EXT4-fs (md3): resized filesystem to 732569952

Huzzah! We have our space! Running a quick df will also show the capacity increase! It's online and ready to use. Hope this helped you, and thanks for reading!
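
One last bit of housekeeping: if your mdadm.conf (/etc/mdadm/mdadm.conf on Debian/Ubuntu, /etc/mdadm.conf elsewhere) pins this array with a num-devices= value, update it to match the new layout. mdadm will print the current definition for you to merge in by hand:

# mdadm --detail --scan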

2016-05-24

Expanding a non-LVM disk on Linux

The example below was done on Ubuntu 14.04 LTS, but it's about the same in any modern Linux distribution.

You can see below that our /storage mount is full, so it's time to add some storage. There are two options here. Luckily, this is a virtual machine, so I can just tell VMware I want to make that disk bigger, reboot or rescan the disk, and it'll pick up the new size... but it won't resize the partition. If this is a physical box, there may still be a simple solution: if the disk has multiple partitions on it, you can consume the next one on the disk to create one big partition and likewise solve this issue. If you're completely out of space on that disk, you'll need some more magic to make it happen, and that won't be covered here.

root@web01.example.com:~# df
Filesystem     1K-blocks      Used Available Use% Mounted on
dev              1012984        12   1012972   1% /dev
tmpfs             204896       756    204140   1% /run
/dev/dm-0      255583756   9385312 233192416   4% /
none                   4         0         4   0% /sys/fs/cgroup
none                5120         0      5120   0% /run/lock
none             1024468         0   1024468   0% /run/shm
none              102400         0    102400   0% /run/user
/dev/sda1         240972    104857    123674  46% /boot
/dev/sdb1      515929528 515912512         0 100% /storage
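
As an aside, if you'd rather not reboot after growing a virtual disk, you can often ask the kernel to rescan it. A minimal sketch, assuming the disk is sdb and it sits on a controller that honors the sysfs rescan hook:

# echo 1 > /sys/class/block/sdb/device/rescan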

The steps to expand the partition above, /dev/sdb1, are as follows:

  1. Unmount the disk with the standard 'umount /dev/sdb1' command.
  2. If you're consuming a partition, skip to step 4. If you're running this as a VM and can simply expand the disk, do so, then reboot or rescan (see the rescan example above). Different virtualization programs allow different options here: some allow online expansion, others require a shutdown first.
  3. After booting back up, make sure the drive is not mounted and open fdisk on that drive. You should see your updated drive size.
  4. Using fdisk, note the partition number you're resizing, its type, and its starting sector. If you're consuming the next partition rather than going all the way to the end of the disk, note that partition's end sector as well.
  5. root@web01.example.com:~# fdisk /dev/sdb
    
    Command (m for help): p
    
    Disk /dev/sdb: 805.3 GB, 805306368000 bytes
    193 heads, 8 sectors/track, 1018694 cylinders, total 1572864000 sectors
    Units = sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disk identifier: 0x4f10cdef
    
       Device Boot      Start         End      Blocks   Id  System
    /dev/sdb1            2048  1048575999   524286976   83  Linux
    
    Command (m for help): d
    Selected partition 1   
    
  6. Delete the partition. If you're consuming another, delete that as well. Now, you haven't deleted data, just pointers to where it starts.
  7. Create a new partition. Start where the old one started; end either at the end of the disk or at the end of the partition you're consuming. Then hit "w" to write the changes to disk.
  8. Command (m for help): n
    Partition type:   
       p   primary (0 primary, 0 extended, 4 free)
       e   extended
    Select (default p): p
    Partition number (1-4, default 1):
    Using default value 1
    First sector (2048-1572863999, default 2048):
    Using default value 2048
    Last sector, +sectors or +size{K,M,G} (2048-1572863999, default 1572863999):
    Using default value 1572863999
    
    Command (m for help): w
    The partition table has been altered!
    
    Calling ioctl() to re-read partition table.
    Syncing disks.
    
  9. Now the system knows about the new partition, but the filesystem inside it still thinks it's the old size. We have to resize the filesystem to fill the space, and before it will resize, the filesystem must be verified clean.
  10. root@web01.example.com:~# resize2fs /dev/sdb1
    resize2fs 1.42.9 (4-Feb-2014)
    Please run 'e2fsck -f /dev/sdb1' first.
    
  11. resize2fs wants a clean filesystem first, so run the check it asked for:
  12. root@web01.example.com:~# e2fsck -f /dev/sdb1
    e2fsck 1.42.9 (4-Feb-2014)
    Pass 1: Checking inodes, blocks, and sizes
    Pass 2: Checking directory structure
    Pass 3: Checking directory connectivity
    Pass 4: Checking reference counts
    Pass 5: Checking group summary information
    /dev/sdb1: 1369/32768000 files (0.6% non-contiguous), 131067490/131071744 blocks
    
  13. Now we can resize the filesystem and then mount it! (This assumes an ext filesystem; see the note after this list for XFS.)
  14. root@web01.example.com:~# resize2fs /dev/sdb1
    resize2fs 1.42.9 (4-Feb-2014)
    Resizing the filesystem on /dev/sdb1 to 196607744 (4k) blocks.
    The filesystem on /dev/sdb1 is now 196607744 blocks long.
    root@web01.example.com:~# mount -a
    root@web01.example.com:~# df
    Filesystem     1K-blocks      Used Available Use% Mounted on
    udev             1012984        12   1012972   1% /dev
    tmpfs             204896       756    204140   1% /run
    /dev/dm-0      255583756   9385316 233192412   4% /
    none                   4         0         4   0% /sys/fs/cgroup
    none                5120         0      5120   0% /run/lock
    none             1024468         0   1024468   0% /run/shm
    none              102400         0    102400   0% /run/user
    /dev/sda1         240972    104857    123674  46% /boot
    /dev/sdb1      773960448 515911432 218711088  71% /storage
    
  15. And a quick 'df' shows that we've got some breathing room!
  16. Go grab a beer or glass of wine and pat yourself on the back.
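
One caveat on step 14: resize2fs only handles the ext2/3/4 family. If /storage were XFS instead, the rough equivalent is xfs_growfs, which, unlike resize2fs here, wants the filesystem mounted while it grows:

# mount /storage
# xfs_growfs /storage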

2016-05-04

Using a Proxy with yum on CentOS/RHEL

If you follow this blog, you'll know that I wrote about how to add HTTP proxy support to apt-get. What about CentOS? Gotta show some love to the 'yum' runners out there. So here we go.

Of course, you can always run:

$ export http_proxy="http://username:password@proxy:port/"
$ export https_proxy="http://username:password@proxy:port/"

I've had some iffy results at times though, and it definitely doesn't work for cron processes or other users. It's not like the system is moving anytime soon, so let's just set this permanently in the yum configs.

$ sudo vi /etc/yum.conf

It will prompt you for your sudo password (if you're not already root). After that, you'll be editing the yum.conf file. There are numerous lines in here that are important, and you may want to look into tweaking settings, but that's not what this post is about. For the proxy, you simply have to tack the following line into the [main] section (note that yum.conf values are unquoted; if embedding credentials in the URL gives you trouble, yum also has separate proxy_username and proxy_password options):

proxy=http://username:password@proxy.example.com:port/

It doesn't matter where in the [main] section it lands, but you'll want to search the file real quick to make sure you're not duplicating the 'proxy' settings, or you may not get the outcome you expect.

Save it, and make sure to run a 'yum update' to get the latest package lists and such. You should notice that it rolls right through them now that your system is able to talk out to the internet. Huzzah.
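
If you want to sanity-check the new setting before a full update, you can force yum back out to the mirrors; both commands below are stock yum:

$ sudo yum clean metadata
$ sudo yum repolist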

2016-05-03

Add Proxy to apt-get on Ubuntu

Many times for security and network topology reasons, I've had to deal with hosts being behind proxies. This is generally fairly easy to work with. Before you make a call out, you can run:

$ export http_proxy="http://username:password@proxy:port/"
$ export https_proxy="http://username:password@proxy:port/"

Then run your normal commands, and MOST things will pay attention to those environment variables. Heck, you can even put them in your .bashrc or whatever else you automatically load into your environment on login.

But that doesn't work for other users... or automated processes, so every cron job you have would need the same treatment. Many programs have their own settings files where the proxy needs to be reflected, and apt is no different. Apt is, however, easy to configure to use a proxy:

$ sudo vi /etc/apt/apt.conf

It will prompt you for your sudo password (if you're not already root). After that, you'll be editing the apt.conf file, and if yours is anything like mine, it's empty. If it isn't empty, make sure you're not duplicating the info we're putting in there. Then enter the following config lines, substituting your own info. If your proxy doesn't require a username/password, you can drop the 'username:password@' portion.

Acquire::http::proxy "http://username:password@proxy.example.com:port/";
Acquire::https::proxy "https://username:password@proxy.example.com:port/";

Save it, and make sure to run an 'apt-get update' to get the latest package lists and such. You should notice that it rolls right through them now that your system is able to talk out to the internet.
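
On newer Ubuntu releases, a tidier home for this is a snippet under /etc/apt/apt.conf.d/ rather than apt.conf itself. The filename below is just a convention I like, not anything apt requires:

$ sudo tee /etc/apt/apt.conf.d/95proxy <<'EOF'
Acquire::http::proxy "http://username:password@proxy.example.com:port/";
Acquire::https::proxy "https://username:password@proxy.example.com:port/";
EOF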