Ticket #7 (new task)

Opened 2 years ago

Last modified 21 months ago

Improve restore speed on local repositories

Reported by: martin
Owned by:
Priority: minor
Milestone: 0.20
Component: box libraries
Version: 0.10
Keywords:
Cc:

Description

Due to the synchronous nature of the protocol, restore performance is very poor on local backup stores (especially when the store is on a local disk). I recently restored a couple of multi-GB backups and it took forever on local disk. The CPU was 98% idle and I/O speed was about 250 kB/s, on disks that normally do 60 MB/s read and write.

Although Box Backup was designed for use over slow links, it is equally useful over fast links and should use them to their full potential.

To resolve this we need to allow streaming of requests rather than requiring an ack after each one.
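As a rough illustration of the difference between acking every request and streaming them, here is a minimal sketch. It does not use Box Backup's real protocol classes: `FakeConnection`, `FetchAll`, `idleCount` and the `window` parameter are all made up for this example, and the "connection" is just an in-memory queue. It only counts how often the link is left with nothing in flight, which is where a synchronous client pays a full round trip.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical in-memory stand-in for a client/server connection.
// Not a Box Backup class; it only models request/reply ordering.
struct FakeConnection {
    std::deque<int> inFlight;   // requests sent but not yet answered
    std::size_t idleCount = 0;  // times the link had nothing left in flight

    void Send(int id) { inFlight.push_back(id); }

    int Receive() {
        assert(!inFlight.empty());
        int id = inFlight.front();
        inFlight.pop_front();
        // An empty queue means the server has nothing to work on until the
        // client sends again, i.e. the link sits idle for a round trip.
        if (inFlight.empty()) ++idleCount;
        return id;
    }
};

// Fetches `count` objects, keeping up to `window` requests outstanding.
// window == 1 is the synchronous "ack after each request" behaviour.
std::vector<int> FetchAll(FakeConnection& conn, int count, std::size_t window) {
    assert(window > 0);
    std::vector<int> replies;
    int next = 0;
    while (static_cast<int>(replies.size()) < count) {
        // Top up the pipeline before blocking on the oldest reply.
        while (next < count && conn.inFlight.size() < window) {
            conn.Send(next++);
        }
        replies.push_back(conn.Receive());
    }
    return replies;
}
```

With `window` set to 1 the link goes idle once per request; with a larger window it only goes idle when the last reply is drained. A real implementation would also need to handle errors arriving for requests that are already queued behind the failing one.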

Note that this also affects backup performance, but that is less interesting.

Change History

Changed 2 years ago by martin

  • type changed from enhancement to task

Changed 2 years ago by chris

I think there is another problem here: bbstored is quite inefficient in how it handles directories. During a compare, it spends most of its time reading directories:

stat64("/mnt/tmp/boxbackup/backup/00000002/54/0e/o8a.rfw", {st_mode=S_IFREG|0664, st_size=497, ...}) = 0
open("/mnt/tmp/boxbackup/backup/00000002/54/0e/o8a.rfw", O_RDONLY|O_LARGEFILE) = 6
fstat64(6, {st_mode=S_IFREG|0664, st_size=497, ...}) = 0
read(6, "DIR_\0\0\0\6\0\0\0\0\0\16T\212\0\0\0\0\0\16Tp\0\3\357b"..., 36) = 36
read(6, "\0\0\0001", 4)                 = 4
read(6, "\2WN\10\377\231\205\264\33\331\362\266\214\22\16+h\326"..., 49) = 49
read(6, "\0\3\3278\366&\244\200\0\0\0\0\0\16T\213\0\0\0\0\0\0\0"..., 34) = 34
read(6, "\212\0", 2)                    = 2
read(6, "\340zWk\204\35\371\327\277\247\356\353\346`k\346\32\324"..., 32) = 32
read(6, "\0\0\0\0", 4)                  = 4
read(6, "\0\3\3278\366&\244\200\0\0\0\0\0\16T\214\0\0\0\0\0\0\0"..., 34) = 34
read(6, "\212\0", 2)                    = 2
read(6, "\2259o\253X\246P\242\263\4\27\310\243\2171\331\340b\222"..., 32) = 32
read(6, "\0\0\0\0", 4)                  = 4
read(6, "\0\3\3278\366&\244\200\0\0\0\0\0\16T\215\0\0\0\0\0\0\0"..., 34) = 34
read(6, "j\0", 2)                       = 2
read(6, "\311N\320a$\345\337\313\231F\207I\253X\310;W\5\335\10v"..., 24) = 24
read(6, "\0\0\0\0", 4)                  = 4
read(6, "\0\3\3278\366&\244\200\0\0\0\0\0\16T\216\0\0\0\0\0\0\0"..., 34) = 34
read(6, "\212\0", 2)                    = 2
read(6, "\320\tHOM\'\232\266\27U\322U\203\323L\227\336\252\263P"..., 32) = 32
read(6, "\0\0\0\0", 4)                  = 4
read(6, "\0\3\3278\366&\244\200\0\0\0\0\0\16T\217\0\0\0\0\0\0\0"..., 34) = 34
read(6, "\212\0", 2)                    = 2
read(6, "|\225\224\'\f\273\tr\376\255\321\325:7\356{e\23x\317gs"..., 32) = 32
read(6, "\0\0\0\0", 4)                  = 4
read(6, "\0\3\331\304q\22\262\0\0\0\0\0\0\16T\220\0\0\0\0\0\0\0"..., 34) = 34
read(6, "J\0", 2)                       = 2
read(6, "\\\3405\214\367\317>+#\235\350\351\203\356\325\360", 16) = 16
read(6, "\0\0\0\0", 4)                  = 4
close(6)

Reading 34 bytes at a time is just no fun. I think we should have a caching reader (like a BufferedReader in Java) that reads 4 KB blocks each time.
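For reference, the trace above issues 27 read() calls to fetch a 497-byte directory file. A caching reader along these lines might look like the sketch below. This is only an illustration under my own assumptions: `BlockBufferedReader` and its `RawRead` callback are hypothetical names, not existing Box Backup stream classes, and the underlying source is passed in as a plain callable so the idea can be shown without touching real file descriptors.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical helper, not part of Box Backup: wraps any "read up to n bytes"
// callable and serves small reads from an in-memory block.
class BlockBufferedReader {
public:
    // RawRead fills the given buffer with up to n bytes and returns the
    // number of bytes read, or 0 at end of data.
    using RawRead = std::function<std::size_t(char*, std::size_t)>;

    explicit BlockBufferedReader(RawRead raw, std::size_t blockSize = 4096)
        : mRaw(std::move(raw)), mBuffer(blockSize), mPos(0), mEnd(0) {
        assert(blockSize > 0);
    }

    // Copies up to n bytes into out. Returns fewer than n only at end of data.
    std::size_t Read(char* out, std::size_t n) {
        std::size_t done = 0;
        while (done < n) {
            if (mPos == mEnd) {
                // Buffer exhausted: refill it with one large underlying read.
                mEnd = mRaw(mBuffer.data(), mBuffer.size());
                mPos = 0;
                if (mEnd == 0) break;  // end of data
            }
            std::size_t chunk = std::min(n - done, mEnd - mPos);
            std::memcpy(out + done, mBuffer.data() + mPos, chunk);
            mPos += chunk;
            done += chunk;
        }
        return done;
    }

private:
    RawRead mRaw;
    std::vector<char> mBuffer;
    std::size_t mPos;  // next unread byte in mBuffer
    std::size_t mEnd;  // number of valid bytes in mBuffer
};
```

With a 4096-byte block, a file the size of the one in the trace would be pulled in by a single underlying read, plus one more that reports end of data if the caller reads past the end. The callback could wrap ::read() on a file descriptor; a write path or seeking would need extra care to keep the buffer consistent.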

Changed 2 years ago by chris

Reformat that strace for readability:

stat64("/mnt/tmp/boxbackup/backup/00000002/54/0e/o8a.rfw", {st_mode=S_IFREG|0664, st_size=497, ...}) = 0
open("/mnt/tmp/boxbackup/backup/00000002/54/0e/o8a.rfw", O_RDONLY|O_LARGEFILE) = 6
fstat64(6, {st_mode=S_IFREG|0664, st_size=497, ...}) = 0
read(6, "DIR_\0\0\0\6\0\0\0\0\0\16T\212\0\0\0\0\0\16Tp\0\3\357b"..., 36) = 36
read(6, "\0\0\0001", 4)                 = 4
read(6, "\2WN\10\377\231\205\264\33\331\362\266\214\22\16+h\326"..., 49) = 49
read(6, "\0\3\3278\366&\244\200\0\0\0\0\0\16T\213\0\0\0\0\0\0\0"..., 34) = 34
read(6, "\212\0", 2)                    = 2
read(6, "\340zWk\204\35\371\327\277\247\356\353\346`k\346\32\324"..., 32) = 32
read(6, "\0\0\0\0", 4)                  = 4
read(6, "\0\3\3278\366&\244\200\0\0\0\0\0\16T\214\0\0\0\0\0\0\0"..., 34) = 34
read(6, "\212\0", 2)                    = 2
read(6, "\2259o\253X\246P\242\263\4\27\310\243\2171\331\340b\222"..., 32) = 32
read(6, "\0\0\0\0", 4)                  = 4
read(6, "\0\3\3278\366&\244\200\0\0\0\0\0\16T\215\0\0\0\0\0\0\0"..., 34) = 34
read(6, "j\0", 2)                       = 2
read(6, "\311N\320a$\345\337\313\231F\207I\253X\310;W\5\335\10v"..., 24) = 24
read(6, "\0\0\0\0", 4)                  = 4
read(6, "\0\3\3278\366&\244\200\0\0\0\0\0\16T\216\0\0\0\0\0\0\0"..., 34) = 34
read(6, "\212\0", 2)                    = 2
read(6, "\320\tHOM\'\232\266\27U\322U\203\323L\227\336\252\263P"..., 32) = 32
read(6, "\0\0\0\0", 4)                  = 4
read(6, "\0\3\3278\366&\244\200\0\0\0\0\0\16T\217\0\0\0\0\0\0\0"..., 34) = 34
read(6, "\212\0", 2)                    = 2
read(6, "|\225\224\'\f\273\tr\376\255\321\325:7\356{e\23x\317gs"..., 32) = 32
read(6, "\0\0\0\0", 4)                  = 4
read(6, "\0\3\331\304q\22\262\0\0\0\0\0\0\16T\220\0\0\0\0\0\0\0"..., 34) = 34
read(6, "J\0", 2)                       = 2
read(6, "\\\3405\214\367\317>+#\235\350\351\203\356\325\360", 16) = 16
read(6, "\0\0\0\0", 4)                  = 4
close(6)