Splitting files with dd

We have an ESXi box hosted with Rackspace. It took a bit of pushing to get them to install ESXi in the first place, as they tried to steer us towards their cloud offering. But this is a staging environment and we want something dedicated, on hardware we control, so we can get an idea of performance without other people’s workloads muddying the water.

Anyway, I’ve been having a bit of fun getting our server template uploaded to it, which is only 11GB compressed – not exactly large, but apparently large enough to be inconvenient.

In my experience the datastore upload tool in the vSphere client frequently fails on large files. In this case I was getting the “Failed to log into NFC server” error, which is probably due to a requisite port not being open. I didn’t like that tool anyway, move on.

The trusty-but-slow scp method was also failing however. Uploads would start but consistently stall at about the 1GB mark. Not sure if it’s a buffer or something getting filled in dropbear (which is designed to be a lightweight ssh server and really shouldn’t need to deal with files this large), but Googling didn’t turn up much.

So I went down the track of splitting up the file into smaller chunks, easily done using the split tool. Except I didn’t know about the split tool and used dd.

So if you’re reading this you probably want to use split, but if for some reason you need to do this on an environment that doesn’t have split and does have dd (like ESXi), this could help:

#!/bin/bash

FILE="$1"

# How big we want the chunks to be in bytes
CHUNKSIZE=$(( 512 * 1024 * 1024 ))

# Block size for dd in bytes
BS=$(( 8 * 1024 ))

# Convert CHUNKSIZE to blocks
CHUNKSIZE=$(( CHUNKSIZE / BS ))

# Skip value for dd, we start at 0
SKIP=0

# Calculate total size of the file in blocks
FSIZE=$(stat -c%s "$FILE")
SIZE=$(( FSIZE / BS ))

# Loop counter for the file name
i=0

echo "Using chunks of $CHUNKSIZE blocks"
echo "Size is $FSIZE bytes = $SIZE blocks"

while [ $SKIP -le $SIZE ]
do
    NEWFILE=$(printf '%s.part%03d' "$FILE" $i)
    i=$(( i + 1 ))

    echo "Creating file $NEWFILE starting after block $SKIP"
    dd if="$FILE" of="$NEWFILE" bs="$BS" count="$CHUNKSIZE" skip=$SKIP

    SKIP=$(( SKIP + CHUNKSIZE ))
done
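For comparison, on a system that does have split (GNU coreutils), the same chunking is a one-liner. The numeric-suffix options shown here are GNU extensions that a BusyBox-based environment like ESXi may lack, which is the whole reason for the dd script above:

```shell
# Make a 3 MiB test file, then split it into 1 MiB chunks.
# -b sets the chunk size; -d -a 3 gives numeric three-digit suffixes
# similar to the dd script's .part000 naming (GNU split options).
dd if=/dev/zero of=testfile bs=1024 count=3072 2>/dev/null
split -b 1M -d -a 3 testfile testfile.part
ls testfile.part*   # testfile.part000 testfile.part001 testfile.part002
```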

Afterwards:

scp ./*.part* user@host:/vmfs/datastore/

Then at the other end you simply concatenate them back together. I generated a list of files with `ls -tr1 *.part*` and pasted it into a script. Obviously the order is critical; reverse-sorting by modification time (which is what the r and t options do) lists the parts in the order they were created. A plain alphabetical sort wouldn’t have worked with my file names, since part10 sorts before part2.

#!/bin/bash

#FLIST=$(ls -tr1 *.part*)
FLIST="devbox.tgz.part0 devbox.tgz.part1 devbox.tgz.part2 devbox.tgz.part3 devbox.tgz.part4 devbox.tgz.part5 devbox.tgz.part6 devbox.tgz.part7 devbox.tgz.part8 devbox.tgz.part9 devbox.tgz.part10 devbox.tgz.part11 devbox.tgz.part12 devbox.tgz.part13 devbox.tgz.part14 devbox.tgz.part15 devbox.tgz.part16 devbox.tgz.part17 devbox.tgz.part18 devbox.tgz.part19 devbox.tgz.part20 devbox.tgz.part21"

OUTPUT="output.tgz"

for F in $FLIST
do
    cat "$F" >> "$OUTPUT"
done
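Once the parts are back together it’s worth confirming nothing was lost or reordered along the way. A quick sanity check is to compare checksums of the original and the reassembled file (md5sum is assumed to be available on both ends; the file names here are illustrative):

```shell
# Create a file, split it into two parts with dd, reassemble with cat,
# and verify the rebuilt file's checksum matches the original.
dd if=/dev/urandom of=original.bin bs=1024 count=100 2>/dev/null
dd if=original.bin of=original.bin.part000 bs=1024 count=60 2>/dev/null
dd if=original.bin of=original.bin.part001 bs=1024 count=60 skip=60 2>/dev/null
cat original.bin.part000 original.bin.part001 > rebuilt.bin
md5sum original.bin rebuilt.bin
```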

11 thoughts on “Splitting files with dd”

  1. Greg Larkin

    Excellent, this saved me having to put a script together myself. I have been backing up huge vmdk files to Rackspace CloudFiles, and they have a 5GB per object limit.

    Cheers,
    Greg

  2. anonymoose

    Cheers for the tip. This came up in a search and was useful for recovering some simulations that I had accidentally appended to, rather than overwritten.

  3. fuchur

    That script really helped me – thanks!

    Just a little improvement for the split:
    NEWFILE=$(printf "$FILE.part%03d" $i)

    1. Chad Jay Harrington

      Here’s what I did to resolve my concerns. Also, the file names in your post indicate that you ran another program as well (tgz)…

      I also had problems with the printf of the FILE name putting quotes at the beginning and end of every part’s filename…

      I am trying to image a 240GB SSD and a 1TB SSD, split into pieces so I can burn them onto ~25GB Bluray discs and do the Autopsy later when I have more money.

      Here’s my code variant:

      #!/bin/bash
      FILE="$1"
      NEWFILE_STUB="$2"

      # How big we want the chunks to be in bytes
      CHUNKSIZE=$(( 4096 * 1024 * 1024 ))
      # Block size for dd in bytes
      BS=$(( 8 * 1024 ))
      # Convert CHUNKSIZE to blocks
      CHUNKSIZE=$(( CHUNKSIZE / BS ))
      # Skip value for dd, we start at 0
      SKIP=0
      # Calculate total size of the file in blocks
      FSIZE=$(stat -c%s "$FILE")
      SIZE=$(( FSIZE / BS ))
      # Loop counter for the file name
      i=0
      echo "Using chunks of $CHUNKSIZE blocks"
      echo "Size is $FSIZE bytes = $SIZE blocks"
      while [ $SKIP -le $SIZE ]
      do
          NEWFILE=$(printf '%s.part%03d' "$NEWFILE_STUB" $i)
          NEWFILE=$(basename "$NEWFILE")
          i=$(( i + 1 ))
          echo "Creating file $NEWFILE starting after block $SKIP"
          dd if="$FILE" of="$NEWFILE" bs="$BS" count="$CHUNKSIZE" skip=$SKIP
          SKIP=$(( SKIP + CHUNKSIZE ))
      done

  4. Chad Jay Harrington

    In case of premature errors where all the pieces are not generated, perhaps we could detect that some of the parts already exist, and ask whether to regenerate them all or just the missing ones. That way we don’t waste any time if it isn’t necessary.
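A minimal sketch of that resume idea, assuming a part can be trusted if it already exists at the full chunk size (a short part left by an interrupted dd would be redone; the tiny sizes here are just for demonstration):

```shell
#!/bin/sh
# Sketch: regenerate a chunk only if its part file is missing or smaller
# than a full chunk, so an interrupted run can be resumed cheaply.
FILE=image.bin
BS=1024
CHUNKSIZE=4                      # blocks per chunk (tiny, for demonstration)
CHUNKBYTES=$(( BS * CHUNKSIZE ))

# Demo input: 10 blocks, so we get two full parts and one short one
dd if=/dev/zero of="$FILE" bs=$BS count=10 2>/dev/null

SIZE=$(( $(stat -c%s "$FILE") / BS ))
SKIP=0
i=0
while [ $SKIP -le $SIZE ]; do
    NEWFILE=$(printf '%s.part%03d' "$FILE" $i)
    # Skip parts that already exist at full chunk size
    if [ ! -f "$NEWFILE" ] || [ "$(stat -c%s "$NEWFILE")" -lt $CHUNKBYTES ]; then
        dd if="$FILE" of="$NEWFILE" bs=$BS count=$CHUNKSIZE skip=$SKIP 2>/dev/null
    fi
    SKIP=$(( SKIP + CHUNKSIZE ))
    i=$(( i + 1 ))
done
ls "$FILE".part*
```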
