旅する情報系大学院生

旅と留学とプログラミング

32bitCPUでpkcrackを動かす話

この記事は、TSG Advent Calendar 2016 - AdventarIS17er Advent Calendar 2016 - Adventarの7日目の記事として書かれました。

駒場祭でpkcrack問を出すために、ARMでpkcrackを動かさねばならず、そのままではセグフォしたので知見を共有します。
ラズパイ2までなど、32bitのアーキテクチャで動かす必要があるときに読んで下さい。

結論から言うと

pkcrack-1.2.2/src/exfunc.cの150行目を、以下のように書き換えてからmake -Bして下さい。

`--> diff before_exfunc.c after_exfunc.c -u
--- before_exfunc.c	2016-12-07 16:06:08.297691503 +0900
+++ after_exfunc.c	2016-12-07 16:05:59.269691329 +0900
@@ -147,7 +147,8 @@
   }
   else;
 
-  data = malloc( lh.csize );
+  data = malloc( lh.csize + 100);
+  data += 100;
 
   if(!data)
   {

美しくない解決法であることは事実です・・・。が、元々のソースコードもmallocのチャンクを破壊してたりfreeしてなかったりとなかなか美しくないので、許して下さい。


原因を深堀してみる

上記の書き換えを行わないと、こんなエラーが出ます。

`--> ~/tools/pkcrack-1.2.2/src/pkcrack -p buho310.pdf -P buho.zip -C easy_3.zip -c buho310.pdf -d dec.zip
Files read. Starting stage 1 on Wed Dec  7 13:42:07 2016
Generating 1st generation of possible key2_633361 values...done.
Found 4194304 possible key2-values.
Now we're trying to reduce these...
Lowest number: 974 values at offset 629645
Lowest number: 929 values at offset 629483
Lowest number: 914 values at offset 629400
Lowest number: 872 values at offset 629227
Lowest number: 845 values at offset 629213
Lowest number: 818 values at offset 629211
Lowest number: 817 values at offset 629151
Lowest number: 763 values at offset 629138
Lowest number: 699 values at offset 629111
Lowest number: 690 values at offset 629106
Lowest number: 684 values at offset 623608
Lowest number: 671 values at offset 623577
Lowest number: 640 values at offset 623571
Lowest number: 630 values at offset 623567
Lowest number: 628 values at offset 623556
Lowest number: 608 values at offset 623547
Lowest number: 549 values at offset 623529
Lowest number: 519 values at offset 623524
Lowest number: 497 values at offset 623517
Lowest number: 452 values at offset 623515
Lowest number: 447 values at offset 623514
Lowest number: 441 values at offset 623513
Lowest number: 425 values at offset 623507
Lowest number: 413 values at offset 623505
Lowest number: 401 values at offset 623481
Lowest number: 384 values at offset 623480
Lowest number: 378 values at offset 623353
Lowest number: 374 values at offset 623271
Lowest number: 371 values at offset 623258
Lowest number: 361 values at offset 623257
Lowest number: 355 values at offset 623222
Lowest number: 340 values at offset 623181
Lowest number: 325 values at offset 623173
Lowest number: 314 values at offset 623172
Lowest number: 310 values at offset 623171
Lowest number: 307 values at offset 623168
Lowest number: 285 values at offset 623167
Lowest number: 268 values at offset 600883
Lowest number: 252 values at offset 600869
Lowest number: 245 values at offset 600867
Lowest number: 222 values at offset 600866
Lowest number: 221 values at offset 600845
Lowest number: 216 values at offset 600829
Lowest number: 206 values at offset 567363
Lowest number: 199 values at offset 567362
Lowest number: 186 values at offset 567338
Lowest number: 178 values at offset 567271
Lowest number: 172 values at offset 567270
Lowest number: 171 values at offset 567269
Lowest number: 164 values at offset 567267
Lowest number: 154 values at offset 567266
Lowest number: 150 values at offset 567263
Lowest number: 149 values at offset 567259
Lowest number: 144 values at offset 567257
Lowest number: 138 values at offset 567256
Lowest number: 137 values at offset 567244
Lowest number: 133 values at offset 567243
Lowest number: 114 values at offset 567185
Lowest number: 113 values at offset 567130
Lowest number: 100 values at offset 567121
Done. Left with 100 possible Values. bestOffset is 567121.
Stage 1 completed. Starting stage 2 on Wed Dec  7 13:44:25 2016
zsh: segmentation fault (core dumped)  ~/tools/pkcrack-1.2.2/src/pkcrack -p buho310.pdf -P buho.zip -C easy_3.zip -c

gdbで実行しながらデバッグしていると、以下のことがわかります。
1.main.c(下記)の231行目でextractの戻り値をplaintextに代入して、235行目でextractの戻り値からoffsetを引いている(!)このextraxt関数の中身がexfunc.c。

231     plaintext = extract( pFromZIP, plainname, caseflg );
232     if( !plaintext )
233         exit(1);
234     plainlength = lh.csize;
235     plaintext -= offset;                                                        
236     }

2.gdbで見てみると、offsetの中身は0xc。
3.plaintext[0]~plaintext[3]にアクセスしようとするとできない。(4以降はできる!)
4.extractの返り値はmalloc( lh.csize );

32bitだとmalloc chunkがprev_sizeの4byte+sizeの4byteで8byteしかないので、12byte引くとダメなのかなと思い、上記のように修正したら治りました。
64bitだとmalloc chunkが16byteなので、12byte引いても問題なかったと考えられます。plaintext[0]~plaintext[3]にアクセスできなかったことからも、妥当かなと思います。

malloc chunkの実装を置いておきます。

1104 struct malloc_chunk {
1105 
1106   INTERNAL_SIZE_T      prev_size;  /* Size of previous chunk (if free).  */
1107   INTERNAL_SIZE_T      size;       /* Size in bytes, including overhead. */
1108 
1109   struct malloc_chunk* fd;         /* double links -- used only if free. */
1110   struct malloc_chunk* bk;
1111 
1112   /* Only used for large blocks: pointer to next larger size.  */
1113   struct malloc_chunk* fd_nextsize; /* double links -- used only if free. */
1114   struct malloc_chunk* bk_nextsize;
1115 };
1116 
1117 
1118 /*
1119    malloc_chunk details:
1120 
1121     (The following includes lightly edited explanations by Colin Plumb.)
1122 
1123     Chunks of memory are maintained using a `boundary tag' method as
1124     described in e.g., Knuth or Standish.  (See the paper by Paul
1125     Wilson ftp://ftp.cs.utexas.edu/pub/garbage/allocsrv.ps for a
1126     survey of such techniques.)  Sizes of free chunks are stored both
1127     in the front of each chunk and at the end.  This makes
1128     consolidating fragmented chunks into bigger chunks very fast.  The
1129     size fields also hold bits representing whether chunks are free or
1130     in use.
1131 
1132     An allocated chunk looks like this:
1133 
1134 
1135     chunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1136         |             Size of previous chunk, if allocated            | |
1137         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1138         |             Size of chunk, in bytes                       |M|P|
1139       mem-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1140         |             User data starts here...                          .
1141         .                                                               .
1142         .             (malloc_usable_size() bytes)                      .
1143         .                                                               |
1144 nextchunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1145         |             Size of chunk                                     |
1146         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1147 
1148 
1149     Where "chunk" is the front of the chunk for the purpose of most of
1150     the malloc code, but "mem" is the pointer that is returned to the
1151     user.  "Nextchunk" is the beginning of the next contiguous chunk.
1152 
1153     Chunks always begin on even word boundaries, so the mem portion
1154     (which is returned to the user) is also on an even word boundary, and
1155     thus at least double-word aligned.
1156 
1157     Free chunks are stored in circular doubly-linked lists, and look like this:
1158 
1159     chunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1160         |             Size of previous chunk                            |
1161         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1162     `head:' |             Size of chunk, in bytes                         |P|
1163       mem-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1164         |             Forward pointer to next chunk in list             |
1165         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1166         |             Back pointer to previous chunk in list            |
1167         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1168         |             Unused space (may be 0 bytes long)                .
1169         .                                                               .
1170         .                                                               |
1171 nextchunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1172     `foot:' |             Size of chunk, in bytes                           |
1173         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1174 
1175     The P (PREV_INUSE) bit, stored in the unused low-order bit of the
1176     chunk size (which is always a multiple of two words), is an in-use
1177     bit for the *previous* chunk.  If that bit is *clear*, then the
1178     word before the current chunk size contains the previous chunk
1179     size, and can be used to find the front of the previous chunk.
1180     The very first chunk allocated always has this bit set,
1181     preventing access to non-existent (or non-owned) memory. If
1182     prev_inuse is set for any given chunk, then you CANNOT determine
1183     the size of the previous chunk, and might even get a memory
1184     addressing fault when trying to do so.
1185 
1186     Note that the `foot' of the current chunk is actually represented
1187     as the prev_size of the NEXT chunk. This makes it easier to
1188     deal with alignments etc but can be very confusing when trying
1189     to extend or adapt this code.
1190 
1191     The two exceptions to all this are
1192 
1193      1. The special chunk `top' doesn't bother using the
1194     trailing size field since there is no next contiguous chunk
1195     that would have to index off it. After initialization, `top'
1196     is forced to always exist.  If it would become less than
1197     MINSIZE bytes long, it is replenished.
1198 
1199      2. Chunks allocated via mmap, which have the second-lowest-order
1200     bit M (IS_MMAPPED) set in their size fields.  Because they are
1201     allocated one-by-one, each must contain its own trailing size field.
1202 
1203 */


感想

楽しかったです(白目)
コメントお待ちしております。