
2016-06-24

Growing a RAID-5 Mdadm while online

Today, I decided that my 3-drive RAID-5 setup just wasn't big enough. I've got two 1.5TB drives and a 1TB drive, with the first 1TB of each in a RAID-5 using mdadm. The extra space on the 1.5s serves as scratch disk for other things that don't especially need the speed or resilience of the RAID-5. But now it's time to throw another TB at it and make it a 4-drive RAID-5. Mdadm can help us out here in just a few steps. First, though, we have to partition the new drive so we can use it, and we need to know which device it is. Let's start by seeing what we have. I know the array is mounted on /array/, so a mount output tells me the device name, and from there I can work out which drives are part of it:

# mount |grep array
/dev/md3 on /array type ext4 (rw,noatime,data=ordered)

# mdadm --misc --detail /dev/md3
/dev/md3:
        Version : 0.91
  Creation Time : Sun Jul 17 21:20:35 2011
     Raid Level : raid5
...
    Number   Major   Minor   RaidDevice State
       0       8       33        0      active sync   /dev/sdc1
       1       8       49        1      active sync   /dev/sdd1
       2       8       97        2      active sync   /dev/sdg1

So sdc1, sdd1, and sdg1 are all part of this array. After inserting the new disk, I run `dmesg | grep TB`, since I know the new drive will be listed with a capacity in TB, and look for a device that wasn't there before:
# dmesg |grep TB
[    3.264540] sd 2:0:0:0: [sdc] 2930277168 512-byte logical blocks: (1.50 TB/1.36 TiB)
[    3.329286] sd 2:0:1:0: [sdd] 2930277168 512-byte logical blocks: (1.50 TB/1.36 TiB)
[    3.329630] sd 3:0:0:0: [sde] 1953525168 512-byte logical blocks: (1.00 TB/931 GiB)
[    5.930020] sd 9:0:0:0: [sdg] 1953525168 512-byte logical blocks: (1.00 TB/931 GiB)

Hey look, a new one, and we shall call you 'sde'. Time to make some partitions:

# fdisk /dev/sde

Welcome to fdisk (util-linux 2.26.2).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.

Device does not contain a recognized partition table.
Created a new DOS disklabel with disk identifier 0x8c82f3b1.

Command (m for help): p
Disk /dev/sde: 931.5 GiB, 1000204886016 bytes, 1953525168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: dos
Disk identifier: 0x8c82f3b1

Command (m for help): n
Partition type
   p   primary (0 primary, 0 extended, 4 free)
   e   extended (container for logical partitions)
Select (default p): p
Partition number (1-4, default 1): 1
First sector (2048-1953525167, default 2048): 2048
Last sector, +sectors or +size{K,M,G,T,P} (2048-1953525167, default 1953525167): 1953525167

Created a new partition 1 of type 'Linux' and of size 931.5 GiB.

Command (m for help): w
The partition table has been altered.
Calling ioctl() to re-read partition table.
Syncing disks.
So, above, we hit 'p' to print what was there. All it did was tell us about the disk again, because there weren't any existing partitions. It's still a good check to make sure there's nothing on the drive already and that it's really the disk you want. If this isn't a new drive and you want to clear it out first, hit 'd' and pick the number of each partition you want to delete until there aren't any left. We then created a single default partition taking up the whole drive.
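
If you'd rather not drive fdisk interactively, the same single whole-disk partition can be created non-interactively with parted. This is just a sketch, not what I ran, and it assumes /dev/sde really is the new, empty disk:

# parted -s /dev/sde mklabel msdos mkpart primary 0% 100%

Double-check the device name first; in script mode parted writes the table immediately, no 'w' confirmation.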

If you wanted to do as I did with the 1.5TB drives and create an array that doesn't take up the whole drive (like say I got a 2TB one, but wanted the first 1TB in this array), create a partition as normal, but change that "Last sector" number to be the same number as one of the other drives partitions. Running the fdisk /dev/<otherDevice> and hitting 'p' to print it's table and then 'q' to quit, will let you find out the sector count of that drive which you can just match here. Feel free to make extra partitions then for whatever else you want to do with the remainder of the disk.
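
As a sketch of that size-matching check without the interactive p/q dance, fdisk can print another drive's table directly, and you can copy the End sector of its array partition:

# fdisk -l /dev/sdc

With the new partition written, we can add the disk to the array and grow it to four devices: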

# mdadm --grow --raid-devices=4 --add /dev/md3 /dev/sde1
mdadm: added /dev/sde1
mdadm: Need to backup 192K of critical section..
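
That "Need to backup 192K of critical section" message is normal. If your mdadm version instead refuses to start the reshape without somewhere to stash that data, the more traditional two-step form with an explicit backup file works too; this is a sketch only, using a hypothetical path on a disk that is not part of the array:

# mdadm --add /dev/md3 /dev/sde1
# mdadm --grow /dev/md3 --raid-devices=4 --backup-file=/root/md3-grow.backup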

# top
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND     
  205 root      20   0       0      0      0 S  29.2  0.0   0:16.08 md3_raid5                                                                           
 4668 root      20   0       0      0      0 D  11.0  0.0   0:04.95 md3_resync    
...
You can see that the md kernel threads are using a decent amount of CPU now (this is a quad-core 2GHz Core 2 based Pentium D), crunching all those RAID-5 checksums.
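
If the reshape is hogging the machine (or crawling along), the kernel's RAID rebuild speed limits can be tuned on the fly. A sketch with example values, in KiB/s per device:

# cat /proc/sys/dev/raid/speed_limit_min /proc/sys/dev/raid/speed_limit_max
# echo 50000 > /proc/sys/dev/raid/speed_limit_min

Raising the minimum makes md push harder even when the disks are otherwise busy; lowering the maximum reins it in.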

# iostat -m 1 100
 
Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sda               0.00         0.00         0.00          0          0
sdb               0.00         0.00         0.00          0          0
sdc             162.00        46.23        32.00         46         32
sdd             164.00        47.38        32.00         47         32
sde              91.00         0.00        29.83          0         29
md3               0.00         0.00         0.00          0          0
sdg             151.00        44.73        28.00         44         28

And iostat shows that sdc, sdd, sdg and the new sde are all moving lots of MB/sec. Interestingly, since sde is new, you can tell it's not being read from, only written to.

If you want to see detailed progress, you can run this:

# watch cat /proc/mdstat
Personalities : [raid1] [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid10] 
md128 : active raid1 sdb1[1] sda1[0]
      112972800 blocks super 1.2 [2/2] [UU]
      bitmap: 1/1 pages [4KB], 65536KB chunk

md3 : active raid5 sde1[3] sdg1[2] sdd1[1] sdc1[0]
      1953519872 blocks super 0.91 level 5, 32k chunk, algorithm 2 [4/4] [UUUU]
      [>....................]  reshape =  3.7% (36999992/976759936) finish=504.7min speed=31026K/sec

Here you can see both 'md128', the RAID-1 mirror for my boot drive (sda1 and sdb1), and the now-expanding RAID-5 'md3' using sdc1, sdd1, sdg1 and, of course, the new sde1. Because that's run via 'watch', it'll update every 2 seconds by default. Taking the 'watch' off the front gives you a one-time status page instead.
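
If you'd rather script the wait than watch it, mdadm can block until the reshape is done. A sketch that chains straight into the filesystem resize we'll do below:

# mdadm --wait /dev/md3 && resize2fs /dev/md3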

And now we wait... about 504.7 minutes, apparently. Eventually, you'll see:

# cat /proc/mdstat
...
md3 : active raid5 sde1[3] sdg1[2] sdd1[1] sdc1[0]
      2930279808 blocks level 5, 32k chunk, algorithm 2 [4/4] [UUUU]
...
# dmesg | tail
...
[42646.351875] md: md3: reshape done.
[42648.156073] RAID conf printout:
[42648.156078]  --- level:5 rd:4 wd:4
[42648.156081]  disk 0, o:1, dev:sdc1
[42648.156084]  disk 1, o:1, dev:sdd1
[42648.156086]  disk 2, o:1, dev:sdg1
[42648.156088]  disk 3, o:1, dev:sde1
[42648.156094] md3: detected capacity change from 2000404348928 to 3000606523392
[42649.508764] VFS: busy inodes on changed media or resized disk md3

But according to df, our filesystem still shows its old size. That's because the filesystem and its allocation tables were laid out while the device was smaller; if it's formatted with any of the ext filesystem types, it can be enlarged online with the following command:

# resize2fs /dev/md3
resize2fs 1.42.12 (29-Aug-2014)
Filesystem at /dev/md3 is mounted on /array; on-line resizing required
old_desc_blocks = 117, new_desc_blocks = 175
The filesystem on /dev/md3 is now 732569952 (4k) blocks long.
# dmesg|tail
...
[52020.706909] EXT4-fs (md3): resizing filesystem from 488379968 to 732569952 blocks
[52023.727545] EXT4-fs (md3): resized filesystem to 732569952

Huzzah! We have our space! Running a quick df will also show the capacity increase! It's online and ready to use. Hope this helped you, and thanks for reading!

2015-06-05

Resizing Online Disks in Linux with LVM and No Reboots

When we set this up, we only had a 16GB primary disk... plenty of space. Until you start writing lots of logs and data, and then it fills up quickly. So let's talk about how to resize an LVM-based partition on a live server without reboots... reboots are for Windows! This system is a CentOS 6.x machine running in VMware 5.x that currently has a 16GiB VMDK-based drive. Let's see what we've got to work with:
$ df -h
Filesystem                                         Size  Used Avail Use% Mounted on
/dev/mapper/myfulldisk--vg-root                     12G   11G     0 100% /
none                                               4.0K     0  4.0K   0% /sys/fs/cgroup
udev                                               7.4G  4.0K  7.4G   1% /dev
tmpfs                                              1.5G  572K  1.5G   1% /run
none                                               5.0M     0  5.0M   0% /run/lock
none                                               7.4G     0  7.4G   0% /run/shm
none                                               100M     0  100M   0% /run/user
/dev/sda1                                          236M   37M  187M  17% /boot
/dev/sdc1                                          246G   44G  190G  19% /data

Hmm... time to get on this, then. Luckily, we're running in VMware, so a quick edit to the VM to enlarge the VMDK (not covered in this how-to) gives us room to grow. First, which device are we talking about?

$ dmesg|grep sd
[    1.562363] sd 2:0:0:0: Attached scsi generic sg1 type 0
[    1.562384] sd 2:0:0:0: [sda] 33554432 512-byte logical blocks: (17.1 GB/16.0 GiB)
[    1.562425] sd 2:0:0:0: [sda] Write Protect is off
[    1.562426] sd 2:0:0:0: [sda] Mode Sense: 61 00 00 00
[    1.562460] sd 2:0:0:0: [sda] Cache data unavailable
[    1.562461] sd 2:0:0:0: [sda] Assuming drive cache: write through
[    1.563331] sd 2:0:0:0: [sda] Cache data unavailable
[    1.563451] sd 2:0:1:0: Attached scsi generic sg2 type 0
[    1.563452] sd 2:0:1:0: [sdb] 8388608 512-byte logical blocks: (4.29 GB/4.00 GiB)
[    1.563479] sd 2:0:1:0: [sdb] Write Protect is off
[    1.563481] sd 2:0:1:0: [sdb] Mode Sense: 61 00 00 00
[    1.563507] sd 2:0:1:0: [sdb] Cache data unavailable
[    1.563508] sd 2:0:1:0: [sdb] Assuming drive cache: write through
[    1.563755] sd 2:0:2:0: Attached scsi generic sg3 type 0
[    1.563881] sd 2:0:2:0: [sdc] 524288000 512-byte logical blocks: (268 GB/250 GiB)
[    1.563942] sd 2:0:2:0: [sdc] Write Protect is off
[    1.563944] sd 2:0:2:0: [sdc] Mode Sense: 61 00 00 00
[    1.564008] sd 2:0:2:0: [sdc] Cache data unavailable
[    1.564010] sd 2:0:2:0: [sdc] Assuming drive cache: write through
[    1.564282] sd 2:0:2:0: [sdc] Cache data unavailable
[    1.564283] sd 2:0:2:0: [sdc] Assuming drive cache: write through
[    1.564360] sd 2:0:1:0: [sdb] Cache data unavailable
[    1.564362] sd 2:0:1:0: [sdb] Assuming drive cache: write through
[    1.564989] sd 2:0:0:0: [sda] Assuming drive cache: write through
[    1.571010]  sdb: sdb1
[    1.571426] sd 2:0:1:0: [sdb] Cache data unavailable
[    1.571514] sd 2:0:1:0: [sdb] Assuming drive cache: write through
[    1.571626] sd 2:0:1:0: [sdb] Attached SCSI disk
[    1.574181]  sda: sda1 sda2 < sda5 >
[    1.574797] sd 2:0:0:0: [sda] Cache data unavailable
[    1.574888] sd 2:0:0:0: [sda] Assuming drive cache: write through
[    1.575003] sd 2:0:0:0: [sda] Attached SCSI disk
[    1.579250]  sdc: sdc1
[    1.579805] sd 2:0:2:0: [sdc] Cache data unavailable
[    1.579944] sd 2:0:2:0: [sdc] Assuming drive cache: write through
[    1.580141] sd 2:0:2:0: [sdc] Attached SCSI disk
[    6.922330] Adding 4193276k swap on /dev/sdb1.  Priority:-1 extents:1 across:4193276k FS
[    7.137134] EXT4-fs (sda1): mounting ext2 file system using the ext4 subsystem
[    7.142419] EXT4-fs (sda1): mounted filesystem without journal. Opts: (null)
[    7.218150] EXT4-fs (sdc1): mounted filesystem with ordered data mode. Opts: (null)
[    7.384566] Installing knfsd (copyright (C) 1996 okir@monad.swb.de).

The first one is the 16GB drive in question. Take the SCSI address on that line (2:0:0:0 here) and use it in the next step to tell the kernel to rescan that device:

$ echo 1 > /sys/class/scsi_device/2\:0\:0\:0/device/rescan
$ dmesg |tail
[1918441.322362] sd 2:0:0:0: [sda] 209715200 512-byte logical blocks: (107 GB/100 GiB)
[1918441.322596] sd 2:0:0:0: [sda] Cache data unavailable
[1918441.330685] sd 2:0:0:0: [sda] Assuming drive cache: write through
[1918441.489622] sda: detected capacity change from 17179869184 to 107374182400

So, that's good: the kernel sees our increased size. Now, let's enlarge that Volume Group. First, we get some info about it:

$ vgdisplay
  --- Volume group ---
  VG Name               myfulldisk-vg
  System ID             
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  3
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                2
  Open LV               2
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               15.76 GiB
  PE Size               4.00 MiB
  Total PE              4034
  Alloc PE / Size       4028 / 15.73 GiB
  Free  PE / Size       6 / 24.00 MiB
  VG UUID               dv3URd-EVvz-oTwY-WiDW-RPt1-4rbD-FnPxxM

That '6' under Free PE is only 24 MiB; the volume group doesn't see our new space yet. To fix that, we need to make a new partition of the right type, then add it to the volume group, which will leave us with plenty more free PEs. Here we go:
$ fdisk /dev/sda
Command (m for help): p

Disk /dev/sda: 107.4 GB, 107374182400 bytes
255 heads, 63 sectors/track, 13054 cylinders, total 209715200 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x000ade37

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *        2048      499711      248832   83  Linux
/dev/sda2          501758    33552383    16525313    5  Extended
/dev/sda5          501760    33552383    16525312   8e  Linux LVM

Command (m for help): n
Partition type:
   p   primary (1 primary, 1 extended, 2 free)
   l   logical (numbered from 5)
Select (default p): p
Partition number (1-4, default 3): 3
First sector (499712-209715199, default 499712): 33552384
Last sector, +sectors or +size{K,M,G} (33552384-209715199, default 209715199): 
Using default value 209715199

Command (m for help): t
Partition number (1-5): 3
Hex code (type L to list codes): 8e
Changed system type of partition 3 to 8e (Linux LVM)

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.

WARNING: Re-reading the partition table failed with error 16: Device or resource busy.
The kernel still uses the old table. The new table will be used at
the next reboot or after you run partprobe(8) or kpartx(8)
Syncing disks.

At this point you could reboot... but we're not going to. Even though this is our root drive, which makes things a little trickier, it's nothing we can't fix:
$ partprobe /dev/sda

Hopefully, partprobe has found your new partition and enlightened the kernel with its wisdom (or at least a fresh load of zeros). Now we need to make that partition available for expanding the disk. This consists of making it a 'Physical Volume', and then adding that physical volume to the Volume Group containing the disk we want to expand.
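
Before creating the physical volume, it's worth a quick sanity check that the kernel now really does see the new partition; a sketch:

$ grep sda3 /proc/partitions

If nothing comes back, the kernel is still on the old table and you'll need another partprobe (or, worst case, that reboot) before continuing.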

$ pvcreate /dev/sda3
  Physical volume "/dev/sda3" successfully created
$ pvdisplay
  --- Physical volume ---
  PV Name               /dev/sda5
  VG Name               myfulldisk-vg
  PV Size               15.76 GiB / not usable 2.00 MiB
  Allocatable           yes 
  PE Size               4.00 MiB
  Total PE              4034
  Free PE               6
  Allocated PE          4028
  PV UUID               a3mhvZ-ogyk-ao4y-2JSM-KVfL-i9no-q0LAUk
   
  "/dev/sda3" is a new physical volume of "84.00 GiB"
  --- NEW Physical volume ---
  PV Name               /dev/sda3
  VG Name               
  PV Size               84.00 GiB
  Allocatable           NO
  PE Size               0   
  Total PE              0
  Free PE               0
  Allocated PE          0
  PV UUID               IpyjOU-1GDy-bTLL-U9kE-iSGP-BYg1-a25LIm
   
$ vgextend /dev/myfulldisk-vg /dev/sda3
  Volume group "myfulldisk-vg" successfully extended
$ vgdisplay
  --- Volume group ---
  VG Name               myfulldisk-vg
  System ID             
  Format                lvm2
  Metadata Areas        2
  Metadata Sequence No  4
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                2
  Open LV               2
  Max PV                0
  Cur PV                2
  Act PV                2
  VG Size               99.76 GiB
  PE Size               4.00 MiB
  Total PE              25538
  Alloc PE / Size       4028 / 15.73 GiB
  Free  PE / Size       21510 / 84.02 GiB
  VG UUID               dv3URd-EVvz-oTwY-WiDW-RPt1-4rbD-FnPxxM
Awesome, we now have 21510 free PEs that we can use; that's 84.02 GiB in this case. Next up, we need to know which logical volume in the VG to extend. Looking back at the 'df' output, and knowing our system, it says "root" in there, and a quick ls of /dev/myfulldisk-vg/ shows the only real choice is between "root" and "swap".
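
If you'd rather not guess from device nodes, lvs lists every logical volume in the group along with its current size; a sketch, with the VG name matching this example:

$ lvs myfulldisk-vg

So, knowing it's root, we move on with: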
$ lvextend -L95G /dev/myfulldisk-vg/root
  Extending logical volume root to 95.00 GiB
  Logical volume root successfully resized
$ df -h
Filesystem                                         Size  Used Avail Use% Mounted on
/dev/mapper/myfulldisk--vg-root                     12G   11G  114M  99% /
none                                               4.0K     0  4.0K   0% /sys/fs/cgroup
udev                                               7.4G  4.0K  7.4G   1% /dev
tmpfs                                              1.5G  576K  1.5G   1% /run
none                                               5.0M     0  5.0M   0% /run/lock
none                                               7.4G     0  7.4G   0% /run/shm
none                                               100M     0  100M   0% /run/user
/dev/sda1                                          236M   37M  187M  17% /boot
/dev/sdc1                                          246G   44G  190G  19% /data
Okay, the logical volume might be bigger, but nothing else knows that yet, because the filesystem ON it is still the original size. Luckily, there's a command for that too:
$ resize2fs /dev/myfulldisk-vg/root
resize2fs 1.42.9 (4-Feb-2014)
Filesystem at /dev/myfulldisk-vg/root is mounted on /; on-line resizing required
old_desc_blocks = 1, new_desc_blocks = 6
The filesystem on /dev/myfulldisk-vg/root is now 24903680 blocks long.

$ df -h
Filesystem                                         Size  Used Avail Use% Mounted on
/dev/mapper/myfulldisk--vg-root                     94G   11G   79G  12% /
none                                               4.0K     0  4.0K   0% /sys/fs/cgroup
udev                                               7.4G  4.0K  7.4G   1% /dev
tmpfs                                              1.5G  576K  1.5G   1% /run
none                                               5.0M     0  5.0M   0% /run/lock
none                                               7.4G     0  7.4G   0% /run/shm
none                                               100M     0  100M   0% /run/user
/dev/sda1                                          236M   37M  187M  17% /boot
/dev/sdc1                                          246G   44G  190G  19% /data


Ha, there we go! 79GB available, enjoy!
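
One parting note: on reasonably recent LVM2 versions, lvextend can grow the filesystem in the same step via -r (--resizefs), and -l +100%FREE grabs every free extent instead of you picking a size. A sketch using this example's names, not something run here:

$ lvextend -r -l +100%FREE /dev/myfulldisk-vg/root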