replace zxq.co/ripple/hanayo

2019-02-23 13:29:15 +00:00
commit c3d206c173
5871 changed files with 1353715 additions and 0 deletions
--- a/vendor/github.com/klauspost/compress/.gitignore
+++ b/vendor/github.com/klauspost/compress/.gitignore
@@ -0,0 +1,24 @@
+# Compiled Object files, Static and Dynamic libs (Shared Objects)
+*.o
+*.a
+*.so
+
+# Folders
+_obj
+_test
+
+# Architecture specific extensions/prefixes
+*.[568vq]
+[568vq].out
+
+*.cgo1.go
+*.cgo2.c
+_cgo_defun.c
+_cgo_gotypes.go
+_cgo_export.*
+
+_testmain.go
+
+*.exe
+*.test
+*.prof
--- a/vendor/github.com/klauspost/compress/.travis.yml
+++ b/vendor/github.com/klauspost/compress/.travis.yml
@@ -0,0 +1,24 @@
+language: go
+
+sudo: false
+
+os:
+  - linux
+  - osx
+
+go:
+  - 1.4
+  - 1.5
+  - 1.6
+  - 1.7
+  - tip
+
+install:
+ - go get -t ./...
+
+script:
+ - diff <(gofmt -d .) <(printf "")
+ - go test -v -cpu=2 ./...
+ - go test -cpu=2 -tags=noasm ./...
+ - go test -cpu=1,2,4 -short -race ./...
+ - go test -cpu=2,4 -short -race -tags=noasm ./...
--- a/vendor/github.com/klauspost/compress/LICENSE
+++ b/vendor/github.com/klauspost/compress/LICENSE
@@ -0,0 +1,27 @@
+Copyright (c) 2012 The Go Authors. All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are
+met:
+
+   * Redistributions of source code must retain the above copyright
+notice, this list of conditions and the following disclaimer.
+   * Redistributions in binary form must reproduce the above
+copyright notice, this list of conditions and the following disclaimer
+in the documentation and/or other materials provided with the
+distribution.
+   * Neither the name of Google Inc. nor the names of its
+contributors may be used to endorse or promote products derived from
+this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
--- a/vendor/github.com/klauspost/compress/README.md
+++ b/vendor/github.com/klauspost/compress/README.md
@@ -0,0 +1,289 @@
+# compress
+
+This package is based on an optimized Deflate function, which is used by gzip/zip/zlib packages.
+
+It offers slightly better compression at lower compression settings, and up to 3x faster encoding at highest compression level.
+
+* [High Throughput Benchmark](http://blog.klauspost.com/go-gzipdeflate-benchmarks/).
+* [Small Payload/Webserver Benchmarks](http://blog.klauspost.com/gzip-performance-for-go-webservers/).
+* [Linear Time Compression](http://blog.klauspost.com/constant-time-gzipzip-compression/).
+* [Re-balancing Deflate Compression Levels](https://blog.klauspost.com/rebalancing-deflate-compression-levels/)
+
+[![Build Status](https://travis-ci.org/klauspost/compress.svg?branch=master)](https://travis-ci.org/klauspost/compress)
+
+# changelog
+* Jan 14, 2017: Reduce stack pressure due to array copies. See [Issue #18625(https://github.com/golang/go/issues/18625).
+* Oct 25, 2016: Level 2-4 have been rewritten and now offers significantly better performance than before.
+* Oct 20, 2016: Port zlib changes from Go 1.7 to fix zlib writer issue. Please update.
+* Oct 16, 2016: Go 1.7 changes merged. Apples to apples this package is a few percent faster, but has a significantly better balance between speed and compression per level. 
+* Mar 24, 2016: Always attempt Huffman encoding on level 4-7. This improves base 64 encoded data compression.
+* Mar 24, 2016: Small speedup for level 1-3.
+* Feb 19, 2016: Faster bit writer, level -2 is 15% faster, level 1 is 4% faster.
+* Feb 19, 2016: Handle small payloads faster in level 1-3.
+* Feb 19, 2016: Added faster level 2 + 3 compression modes.
+* Feb 19, 2016: [Rebalanced compression levels](https://blog.klauspost.com/rebalancing-deflate-compression-levels/), so there is a more even progresssion in terms of compression. New default level is 5.
+* Feb 14, 2016: Snappy: Merge upstream changes. 
+* Feb 14, 2016: Snappy: Fix aggressive skipping.
+* Feb 14, 2016: Snappy: Update benchmark.
+* Feb 13, 2016: Deflate: Fixed assembler problem that could lead to sub-optimal compression.
+* Feb 12, 2016: Snappy: Added AMD64 SSE 4.2 optimizations to matching, which makes easy to compress material run faster. Typical speedup is around 25%.
+* Feb 9, 2016: Added Snappy package fork. This version is 5-7% faster, much more on hard to compress content.
+* Jan 30, 2016: Optimize level 1 to 3 by not considering static dictionary or storing uncompressed. ~4-5% speedup.
+* Jan 16, 2016: Optimization on deflate level 1,2,3 compression.
+* Jan 8 2016: Merge [CL 18317](https://go-review.googlesource.com/#/c/18317): fix reading, writing of zip64 archives.
+* Dec 8 2015: Make level 1 and -2 deterministic even if write size differs.
+* Dec 8 2015: Split encoding functions, so hashing and matching can potentially be inlined. 1-3% faster on AMD64. 5% faster on other platforms.
+* Dec 8 2015: Fixed rare [one byte out-of bounds read](https://github.com/klauspost/compress/issues/20). Please update!
+* Nov 23 2015: Optimization on token writer. ~2-4% faster. Contributed by [@dsnet](https://github.com/dsnet).
+* Nov 20 2015: Small optimization to bit writer on 64 bit systems.
+* Nov 17 2015: Fixed out-of-bound errors if the underlying Writer returned an error. See [#15](https://github.com/klauspost/compress/issues/15).
+* Nov 12 2015: Added [io.WriterTo](https://golang.org/pkg/io/#WriterTo) support to gzip/inflate.
+* Nov 11 2015: Merged [CL 16669](https://go-review.googlesource.com/#/c/16669/4): archive/zip: enable overriding (de)compressors per file
+* Oct 15 2015: Added skipping on uncompressible data. Random data speed up >5x.
+
+# usage
+
+The packages are drop-in replacements for standard libraries. Simply replace the import path to use them:
+
+| old import         | new import                              |
+|--------------------|-----------------------------------------|
+| `compress/gzip`    | `github.com/klauspost/compress/gzip`    |
+| `compress/zlib`    | `github.com/klauspost/compress/zlib`    |
+| `archive/zip`      | `github.com/klauspost/compress/zip`     |
+| `compress/deflate` | `github.com/klauspost/compress/deflate` |
+
+You may also be interested in [pgzip](https://github.com/klauspost/pgzip), which is a drop in replacement for gzip, which support multithreaded compression on big files and the optimized [crc32](https://github.com/klauspost/crc32) package used by these packages.
+
+The packages contains the same as the standard library, so you can use the godoc for that: [gzip](http://golang.org/pkg/compress/gzip/), [zip](http://golang.org/pkg/archive/zip/),  [zlib](http://golang.org/pkg/compress/zlib/), [flate](http://golang.org/pkg/compress/flate/).
+
+Currently there is only minor speedup on decompression (mostly CRC32 calculation).
+
+# deflate optimizations
+
+* Minimum matches are 4 bytes, this leads to fewer searches and better compression. (In Go 1.7)
+* Stronger hash (iSCSI CRC32) for matches on x64 with SSE 4.2 support. This leads to fewer hash collisions. (Go 1.7 also has improved hashes)
+* Literal byte matching using SSE 4.2 for faster match comparisons. (not in Go)
+* Bulk hashing on matches. (In Go 1.7)
+* Much faster dictionary indexing with `NewWriterDict()`/`Reset()`. (In Go 1.7)
+* Make Bit Coder faster by assuming we are on a 64 bit CPU. (In Go 1.7)
+* Level 1 compression replaced by converted "Snappy" algorithm. (In Go 1.7)
+* Uncompressible content is detected and skipped faster. (Only in BestSpeed in Go)
+* A lot of branching eliminated by having two encoders for levels 4-6 and 7-9. (not in Go)
+* All heap memory allocations eliminated. (In Go 1.7)
+
+```
+benchmark                              old ns/op     new ns/op     delta
+BenchmarkEncodeDigitsSpeed1e4-4        554029        265175        -52.14%
+BenchmarkEncodeDigitsSpeed1e5-4        3908558       2416595       -38.17%
+BenchmarkEncodeDigitsSpeed1e6-4        37546692      24875330      -33.75%
+BenchmarkEncodeDigitsDefault1e4-4      781510        486322        -37.77%
+BenchmarkEncodeDigitsDefault1e5-4      15530248      6740175       -56.60%
+BenchmarkEncodeDigitsDefault1e6-4      174915710     76498625      -56.27%
+BenchmarkEncodeDigitsCompress1e4-4     769995        485652        -36.93%
+BenchmarkEncodeDigitsCompress1e5-4     15450113      6929589       -55.15%
+BenchmarkEncodeDigitsCompress1e6-4     175114660     73348495      -58.11%
+BenchmarkEncodeTwainSpeed1e4-4         560122        275977        -50.73%
+BenchmarkEncodeTwainSpeed1e5-4         3740978       2506095       -33.01%
+BenchmarkEncodeTwainSpeed1e6-4         35542802      21904440      -38.37%
+BenchmarkEncodeTwainDefault1e4-4       828534        549026        -33.74%
+BenchmarkEncodeTwainDefault1e5-4       13667153      7528455       -44.92%
+BenchmarkEncodeTwainDefault1e6-4       141191770     79952170      -43.37%
+BenchmarkEncodeTwainCompress1e4-4      830050        545694        -34.26%
+BenchmarkEncodeTwainCompress1e5-4      16620852      8460600       -49.10%
+BenchmarkEncodeTwainCompress1e6-4      193326820     90808750      -53.03%
+
+benchmark                              old MB/s     new MB/s     speedup
+BenchmarkEncodeDigitsSpeed1e4-4        18.05        37.71        2.09x
+BenchmarkEncodeDigitsSpeed1e5-4        25.58        41.38        1.62x
+BenchmarkEncodeDigitsSpeed1e6-4        26.63        40.20        1.51x
+BenchmarkEncodeDigitsDefault1e4-4      12.80        20.56        1.61x
+BenchmarkEncodeDigitsDefault1e5-4      6.44         14.84        2.30x
+BenchmarkEncodeDigitsDefault1e6-4      5.72         13.07        2.28x
+BenchmarkEncodeDigitsCompress1e4-4     12.99        20.59        1.59x
+BenchmarkEncodeDigitsCompress1e5-4     6.47         14.43        2.23x
+BenchmarkEncodeDigitsCompress1e6-4     5.71         13.63        2.39x
+BenchmarkEncodeTwainSpeed1e4-4         17.85        36.23        2.03x
+BenchmarkEncodeTwainSpeed1e5-4         26.73        39.90        1.49x
+BenchmarkEncodeTwainSpeed1e6-4         28.14        45.65        1.62x
+BenchmarkEncodeTwainDefault1e4-4       12.07        18.21        1.51x
+BenchmarkEncodeTwainDefault1e5-4       7.32         13.28        1.81x
+BenchmarkEncodeTwainDefault1e6-4       7.08         12.51        1.77x
+BenchmarkEncodeTwainCompress1e4-4      12.05        18.33        1.52x
+BenchmarkEncodeTwainCompress1e5-4      6.02         11.82        1.96x
+BenchmarkEncodeTwainCompress1e6-4      5.17         11.01        2.13x
+```
+* "Speed" is compression level 1
+* "Default" is compression level 6
+* "Compress" is compression level 9
+* Test files are [Digits](https://github.com/klauspost/compress/blob/master/testdata/e.txt) (no matches) and [Twain](https://github.com/klauspost/compress/blob/master/testdata/Mark.Twain-Tom.Sawyer.txt) (plain text) .
+
+As can be seen it shows a very good speedup all across the line.
+
+`Twain` is a much more realistic benchmark, and will be closer to JSON/HTML performance. Here speed is equivalent or faster, up to 2 times.
+
+**Without assembly**. This is what you can expect on systems that does not have amd64 and SSE 4:
+```
+benchmark                              old ns/op     new ns/op     delta
+BenchmarkEncodeDigitsSpeed1e4-4        554029        249558        -54.96%
+BenchmarkEncodeDigitsSpeed1e5-4        3908558       2295216       -41.28%
+BenchmarkEncodeDigitsSpeed1e6-4        37546692      22594905      -39.82%
+BenchmarkEncodeDigitsDefault1e4-4      781510        579850        -25.80%
+BenchmarkEncodeDigitsDefault1e5-4      15530248      10096561      -34.99%
+BenchmarkEncodeDigitsDefault1e6-4      174915710     111470780     -36.27%
+BenchmarkEncodeDigitsCompress1e4-4     769995        579708        -24.71%
+BenchmarkEncodeDigitsCompress1e5-4     15450113      10266373      -33.55%
+BenchmarkEncodeDigitsCompress1e6-4     175114660     110170120     -37.09%
+BenchmarkEncodeTwainSpeed1e4-4         560122        260679        -53.46%
+BenchmarkEncodeTwainSpeed1e5-4         3740978       2097372       -43.94%
+BenchmarkEncodeTwainSpeed1e6-4         35542802      20353449      -42.74%
+BenchmarkEncodeTwainDefault1e4-4       828534        646016        -22.03%
+BenchmarkEncodeTwainDefault1e5-4       13667153      10056369      -26.42%
+BenchmarkEncodeTwainDefault1e6-4       141191770     105268770     -25.44%
+BenchmarkEncodeTwainCompress1e4-4      830050        642401        -22.61%
+BenchmarkEncodeTwainCompress1e5-4      16620852      11157081      -32.87%
+BenchmarkEncodeTwainCompress1e6-4      193326820     121780770     -37.01%
+
+benchmark                              old MB/s     new MB/s     speedup
+BenchmarkEncodeDigitsSpeed1e4-4        18.05        40.07        2.22x
+BenchmarkEncodeDigitsSpeed1e5-4        25.58        43.57        1.70x
+BenchmarkEncodeDigitsSpeed1e6-4        26.63        44.26        1.66x
+BenchmarkEncodeDigitsDefault1e4-4      12.80        17.25        1.35x
+BenchmarkEncodeDigitsDefault1e5-4      6.44         9.90         1.54x
+BenchmarkEncodeDigitsDefault1e6-4      5.72         8.97         1.57x
+BenchmarkEncodeDigitsCompress1e4-4     12.99        17.25        1.33x
+BenchmarkEncodeDigitsCompress1e5-4     6.47         9.74         1.51x
+BenchmarkEncodeDigitsCompress1e6-4     5.71         9.08         1.59x
+BenchmarkEncodeTwainSpeed1e4-4         17.85        38.36        2.15x
+BenchmarkEncodeTwainSpeed1e5-4         26.73        47.68        1.78x
+BenchmarkEncodeTwainSpeed1e6-4         28.14        49.13        1.75x
+BenchmarkEncodeTwainDefault1e4-4       12.07        15.48        1.28x
+BenchmarkEncodeTwainDefault1e5-4       7.32         9.94         1.36x
+BenchmarkEncodeTwainDefault1e6-4       7.08         9.50         1.34x
+BenchmarkEncodeTwainCompress1e4-4      12.05        15.57        1.29x
+BenchmarkEncodeTwainCompress1e5-4      6.02         8.96         1.49x
+BenchmarkEncodeTwainCompress1e6-4      5.17         8.21         1.59x
+```
+So even without the assembly optimizations there is a general speedup across the board.
+
+## level 1-3 "snappy" compression
+
+Levels 1 "Best Speed", 2 and 3 are completely replaced by a converted version of the algorithm found in Snappy, modified to be fully
+compatible with the deflate bitstream (and thus still compatible with all existing zlib/gzip libraries and tools).
+This version is considerably faster than the "old" deflate at level 1. It does however come at a compression loss, usually in the order of 3-4% compared to the old level 1. However, the speed is usually 1.75 times that of the fastest deflate mode.
+
+In my previous experiments the most common case for "level 1" was that it provided no significant speedup, only lower compression compared to level 2 and sometimes even 3. However, the modified Snappy algorithm provides a very good sweet spot. Usually about 75% faster and with only little compression loss. Therefore I decided to *replace* level 1 with this mode entirely.
+
+Input is split into blocks of 64kb of, and they are encoded independently (no backreferences across blocks) for the best speed. Contrary to Snappy the output is entropy-encoded, so you will almost always see better compression than Snappy. But Snappy is still about twice as fast as Snappy in deflate mode.
+
+Level 2 and 3 have also been replaced. Level 2 is capable is matching between blocks and level 3 checks up to two hashes for matches it will try.
+
+## compression levels
+
+This table shows the compression at each level, and the percentage of the output size compared to output
+at the similar level with the standard library. Compression data is `Twain`, see above.
+
+(Not up-to-date after rebalancing)
+
+| Level | Bytes  | % size |
+|-------|--------|--------|
+| 1     | 194622 | 103.7% |
+| 2     | 174684 | 96.85% |
+| 3     | 170301 | 98.45% |
+| 4     | 165253 | 97.69% |
+| 5     | 161274 | 98.65% |
+| 6     | 160464 | 99.71% |
+| 7     | 160304 | 99.87% |
+| 8     | 160279 | 99.99% |
+| 9     | 160279 | 99.99% |
+
+To interpret and example, this version of deflate compresses input of 407287 bytes to 161274 bytes at level 5, which is 98.6% of the size of what the standard library produces; 161274 bytes.
+
+This means that from level 4 you can expect a compression level increase of a few percent. Level 1 is about 3% worse, as descibed above.
+
+# linear time compression (huffman only)
+
+This compression library adds a special compression level, named `ConstantCompression`, which allows near linear time compression. This is done by completely disabling matching of previous data, and only reduce the number of bits to represent each character. 
+
+This means that often used characters, like 'e' and ' ' (space) in text use the fewest bits to represent, and rare characters like '¤' takes more bits to represent. For more information see [wikipedia](https://en.wikipedia.org/wiki/Huffman_coding) or this nice [video](https://youtu.be/ZdooBTdW5bM).
+
+Since this type of compression has much less variance, the compression speed is mostly unaffected by the input data, and is usually more than *180MB/s* for a single core.
+
+The downside is that the compression ratio is usually considerably worse than even the fastest conventional compression. The compression raio can never be better than 8:1 (12.5%). 
+
+The linear time compression can be used as a "better than nothing" mode, where you cannot risk the encoder to slow down on some content. For comparison, the size of the "Twain" text is *233460 bytes* (+29% vs. level 1) and encode speed is 144MB/s (4.5x level 1). So in this case you trade a 30% size increase for a 4 times speedup.
+
+For more information see my blog post on [Fast Linear Time Compression](http://blog.klauspost.com/constant-time-gzipzip-compression/).
+
+This is implemented on Go 1.7 as "Huffman Only" mode, though not exposed for gzip.
+
+
+# gzip/zip optimizations
+ * Uses the faster deflate
+ * Uses SSE 4.2 CRC32 calculations.
+
+Speed increase is up to 3x of the standard library, but usually around 2x. 
+
+This is close to a real world benchmark as you will get. A 2.3MB JSON file. (NOTE: not up-to-date)
+
+```
+benchmark             old ns/op     new ns/op     delta
+BenchmarkGzipL1-4     95212470      59938275      -37.05%
+BenchmarkGzipL2-4     102069730     76349195      -25.20%
+BenchmarkGzipL3-4     115472770     82492215      -28.56%
+BenchmarkGzipL4-4     153197780     107570890     -29.78%
+BenchmarkGzipL5-4     203930260     134387930     -34.10%
+BenchmarkGzipL6-4     233172100     145495400     -37.60%
+BenchmarkGzipL7-4     297190260     197926950     -33.40%
+BenchmarkGzipL8-4     512819750     376244733     -26.63%
+BenchmarkGzipL9-4     563366800     403266833     -28.42%
+
+benchmark             old MB/s     new MB/s     speedup
+BenchmarkGzipL1-4     52.11        82.78        1.59x
+BenchmarkGzipL2-4     48.61        64.99        1.34x
+BenchmarkGzipL3-4     42.97        60.15        1.40x
+BenchmarkGzipL4-4     32.39        46.13        1.42x
+BenchmarkGzipL5-4     24.33        36.92        1.52x
+BenchmarkGzipL6-4     21.28        34.10        1.60x
+BenchmarkGzipL7-4     16.70        25.07        1.50x
+BenchmarkGzipL8-4     9.68         13.19        1.36x
+BenchmarkGzipL9-4     8.81         12.30        1.40x
+```
+
+Multithreaded compression using [pgzip](https://github.com/klauspost/pgzip) comparison, Quadcore, CPU = 8:
+
+(Not updated, old numbers)
+
+```
+benchmark           old ns/op     new ns/op     delta
+BenchmarkGzipL1     96155500      25981486      -72.98%
+BenchmarkGzipL2     101905830     24601408      -75.86%
+BenchmarkGzipL3     113506490     26321506      -76.81%
+BenchmarkGzipL4     143708220     31761818      -77.90%
+BenchmarkGzipL5     188210770     39602266      -78.96%
+BenchmarkGzipL6     209812000     40402313      -80.74%
+BenchmarkGzipL7     270015440     56103210      -79.22%
+BenchmarkGzipL8     461359700     91255220      -80.22%
+BenchmarkGzipL9     498361833     88755075      -82.19%
+
+benchmark           old MB/s     new MB/s     speedup
+BenchmarkGzipL1     51.60        190.97       3.70x
+BenchmarkGzipL2     48.69        201.69       4.14x
+BenchmarkGzipL3     43.71        188.51       4.31x
+BenchmarkGzipL4     34.53        156.22       4.52x
+BenchmarkGzipL5     26.36        125.29       4.75x
+BenchmarkGzipL6     23.65        122.81       5.19x
+BenchmarkGzipL7     18.38        88.44        4.81x
+BenchmarkGzipL8     10.75        54.37        5.06x
+BenchmarkGzipL9     9.96         55.90        5.61x
+```
+
+# snappy package
+
+The standard snappy package has now been improved. This repo contains a copy of the snappy repo.
+
+I would advise to use the standard package: https://github.com/golang/snappy
+
+
+# license
+
+This code is licensed under the same conditions as the original Go code. See LICENSE file.
--- a/vendor/github.com/klauspost/compress/flate/asm_test.go
+++ b/vendor/github.com/klauspost/compress/flate/asm_test.go
@@ -0,0 +1,193 @@
+// Copyright 2015, Klaus Post, see LICENSE for details.
+
+//+build amd64
+
+package flate
+
+import (
+	"math/rand"
+	"testing"
+)
+
+func TestCRC(t *testing.T) {
+	if !useSSE42 {
+		t.Skip("Skipping CRC test, no SSE 4.2 available")
+	}
+	for _, x := range deflateTests {
+		y := x.out
+		if len(y) >= minMatchLength {
+			t.Logf("In: %v, Out:0x%08x", y[0:minMatchLength], crc32sse(y[0:minMatchLength]))
+		}
+	}
+}
+
+func TestCRCBulk(t *testing.T) {
+	if !useSSE42 {
+		t.Skip("Skipping CRC test, no SSE 4.2 available")
+	}
+	for _, x := range deflateTests {
+		y := x.out
+		y = append(y, y...)
+		y = append(y, y...)
+		y = append(y, y...)
+		y = append(y, y...)
+		y = append(y, y...)
+		y = append(y, y...)
+		if !testing.Short() {
+			y = append(y, y...)
+			y = append(y, y...)
+		}
+		y = append(y, 1)
+		if len(y) >= minMatchLength {
+			for j := len(y) - 1; j >= 4; j-- {
+
+				// Create copy, so we easier detect of-of-bound reads
+				test := make([]byte, j)
+				test2 := make([]byte, j)
+				copy(test, y[:j])
+				copy(test2, y[:j])
+
+				// We allocate one more than we need to test for unintentional overwrites
+				dst := make([]uint32, j-3+1)
+				ref := make([]uint32, j-3+1)
+				for i := range dst {
+					dst[i] = uint32(i + 100)
+					ref[i] = uint32(i + 101)
+				}
+				// Last entry must NOT be overwritten.
+				dst[j-3] = 0x1234
+				ref[j-3] = 0x1234
+
+				// Do two encodes we can compare
+				crc32sseAll(test, dst)
+				crc32sseAll(test2, ref)
+
+				// Check all values
+				for i, got := range dst {
+					if i == j-3 {
+						if dst[i] != 0x1234 {
+							t.Fatalf("end of expected dst overwritten, was %08x", uint32(dst[i]))
+						}
+						continue
+					}
+					expect := crc32sse(y[i : i+4])
+					if got != expect && got == uint32(i)+100 {
+						t.Errorf("Len:%d Index:%d, expected 0x%08x but not modified", len(y), i, uint32(expect))
+					} else if got != expect {
+						t.Errorf("Len:%d Index:%d, got 0x%08x expected:0x%08x", len(y), i, uint32(got), uint32(expect))
+					}
+					expect = ref[i]
+					if got != expect {
+						t.Errorf("Len:%d Index:%d, got 0x%08x expected:0x%08x", len(y), i, got, expect)
+					}
+				}
+			}
+		}
+	}
+}
+
+func TestMatchLen(t *testing.T) {
+	if !useSSE42 {
+		t.Skip("Skipping Matchlen test, no SSE 4.2 available")
+	}
+	// Maximum length tested
+	var maxLen = 512
+
+	// Skips per iteration
+	is, js, ks := 3, 2, 1
+	if testing.Short() {
+		is, js, ks = 7, 5, 3
+	}
+
+	a := make([]byte, maxLen)
+	b := make([]byte, maxLen)
+	bb := make([]byte, maxLen)
+	rand.Seed(1)
+	for i := range a {
+		a[i] = byte(rand.Int63())
+		b[i] = byte(rand.Int63())
+	}
+
+	// Test different lengths
+	for i := 0; i < maxLen; i += is {
+		// Test different dst offsets.
+		for j := 0; j < maxLen-1; j += js {
+			copy(bb, b)
+			// Test different src offsets
+			for k := i - 1; k >= 0; k -= ks {
+				copy(bb[j:], a[k:i])
+				maxTest := maxLen - j
+				if maxTest > maxLen-k {
+					maxTest = maxLen - k
+				}
+				got := matchLenSSE4(a[k:], bb[j:], maxTest)
+				expect := matchLenReference(a[k:], bb[j:], maxTest)
+				if got > maxTest || got < 0 {
+					t.Fatalf("unexpected result %d (len:%d, src offset: %d, dst offset:%d)", got, maxTest, k, j)
+				}
+				if got != expect {
+					t.Fatalf("Mismatch, expected %d, got %d", expect, got)
+				}
+			}
+		}
+	}
+}
+
+// matchLenReference is a reference matcher.
+func matchLenReference(a, b []byte, max int) int {
+	for i := 0; i < max; i++ {
+		if a[i] != b[i] {
+			return i
+		}
+	}
+	return max
+}
+
+func TestHistogram(t *testing.T) {
+	if !useSSE42 {
+		t.Skip("Skipping Matchlen test, no SSE 4.2 available")
+	}
+	// Maximum length tested
+	const maxLen = 65536
+	var maxOff = 8
+
+	// Skips per iteration
+	is, js := 5, 3
+	if testing.Short() {
+		is, js = 9, 1
+		maxOff = 1
+	}
+
+	a := make([]byte, maxLen+maxOff)
+	rand.Seed(1)
+	for i := range a {
+		a[i] = byte(rand.Int63())
+	}
+
+	// Test different lengths
+	for i := 0; i <= maxLen; i += is {
+		// Test different offsets
+		for j := 0; j < maxOff; j += js {
+			var got [256]int32
+			var reference [256]int32
+
+			histogram(a[j:i+j], got[:])
+			histogramReference(a[j:i+j], reference[:])
+			for k := range got {
+				if got[k] != reference[k] {
+					t.Fatalf("mismatch at len:%d, offset:%d, value %d: (got) %d != %d (expected)", i, j, k, got[k], reference[k])
+				}
+			}
+		}
+	}
+}
+
+// histogramReference is a reference
+func histogramReference(b []byte, h []int32) {
+	if len(h) < 256 {
+		panic("Histogram too small")
+	}
+	for _, t := range b {
+		h[t]++
+	}
+}
--- a/vendor/github.com/klauspost/compress/flate/copy.go
+++ b/vendor/github.com/klauspost/compress/flate/copy.go
@@ -0,0 +1,32 @@
+// Copyright 2012 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+package flate
+
+// forwardCopy is like the built-in copy function except that it always goes
+// forward from the start, even if the dst and src overlap.
+// It is equivalent to:
+//   for i := 0; i < n; i++ {
+//     mem[dst+i] = mem[src+i]
+//   }
+func forwardCopy(mem []byte, dst, src, n int) {
+	if dst <= src {
+		copy(mem[dst:dst+n], mem[src:src+n])
+		return
+	}
+	for {
+		if dst >= src+n {
+			copy(mem[dst:dst+n], mem[src:src+n])
+			return
+		}
+		// There is some forward overlap.  The destination
+		// will be filled with a repeated pattern of mem[src:src+k].
+		// We copy one instance of the pattern here, then repeat.
+		// Each time around this loop k will double.
+		k := dst - src
+		copy(mem[dst:dst+k], mem[src:src+k])
+		n -= k
+		dst += k
+	}
+}
--- a/vendor/github.com/klauspost/compress/flate/copy_test.go
+++ b/vendor/github.com/klauspost/compress/flate/copy_test.go
@@ -0,0 +1,54 @@
+// Copyright 2012 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+package flate
+
+import (
+	"testing"
+)
+
+func TestForwardCopy(t *testing.T) {
+	testCases := []struct {
+		dst0, dst1 int
+		src0, src1 int
+		want       string
+	}{
+		{0, 9, 0, 9, "012345678"},
+		{0, 5, 4, 9, "45678"},
+		{4, 9, 0, 5, "01230"},
+		{1, 6, 3, 8, "34567"},
+		{3, 8, 1, 6, "12121"},
+		{0, 9, 3, 6, "345"},
+		{3, 6, 0, 9, "012"},
+		{1, 6, 0, 9, "00000"},
+		{0, 4, 7, 8, "7"},
+		{0, 1, 6, 8, "6"},
+		{4, 4, 6, 9, ""},
+		{2, 8, 6, 6, ""},
+		{0, 0, 0, 0, ""},
+	}
+	for _, tc := range testCases {
+		b := []byte("0123456789")
+		n := tc.dst1 - tc.dst0
+		if tc.src1-tc.src0 < n {
+			n = tc.src1 - tc.src0
+		}
+		forwardCopy(b, tc.dst0, tc.src0, n)
+		got := string(b[tc.dst0 : tc.dst0+n])
+		if got != tc.want {
+			t.Errorf("dst=b[%d:%d], src=b[%d:%d]: got %q, want %q",
+				tc.dst0, tc.dst1, tc.src0, tc.src1, got, tc.want)
+		}
+		// Check that the bytes outside of dst[:n] were not modified.
+		for i, x := range b {
+			if i >= tc.dst0 && i < tc.dst0+n {
+				continue
+			}
+			if int(x) != '0'+i {
+				t.Errorf("dst=b[%d:%d], src=b[%d:%d]: copy overrun at b[%d]: got '%c', want '%c'",
+					tc.dst0, tc.dst1, tc.src0, tc.src1, i, x, '0'+i)
+			}
+		}
+	}
+}
--- a/vendor/github.com/klauspost/compress/flate/crc32_amd64.go
+++ b/vendor/github.com/klauspost/compress/flate/crc32_amd64.go
@@ -0,0 +1,41 @@
+//+build !noasm
+//+build !appengine
+
+// Copyright 2015, Klaus Post, see LICENSE for details.
+
+package flate
+
+import (
+	"github.com/klauspost/cpuid"
+)
+
+// crc32sse returns a hash for the first 4 bytes of the slice
+// len(a) must be >= 4.
+//go:noescape
+func crc32sse(a []byte) uint32
+
+// crc32sseAll calculates hashes for each 4-byte set in a.
+// dst must be east len(a) - 4 in size.
+// The size is not checked by the assembly.
+//go:noescape
+func crc32sseAll(a []byte, dst []uint32)
+
+// matchLenSSE4 returns the number of matching bytes in a and b
+// up to length 'max'. Both slices must be at least 'max'
+// bytes in size.
+//
+// TODO: drop the "SSE4" name, since it doesn't use any SSE instructions.
+//
+//go:noescape
+func matchLenSSE4(a, b []byte, max int) int
+
+// histogram accumulates a histogram of b in h.
+// h must be at least 256 entries in length,
+// and must be cleared before calling this function.
+//go:noescape
+func histogram(b []byte, h []int32)
+
+// Detect SSE 4.2 feature.
+func init() {
+	useSSE42 = cpuid.CPU.SSE42()
+}
--- a/vendor/github.com/klauspost/compress/flate/crc32_amd64.s
+++ b/vendor/github.com/klauspost/compress/flate/crc32_amd64.s
@@ -0,0 +1,213 @@
+//+build !noasm
+//+build !appengine
+
+// Copyright 2015, Klaus Post, see LICENSE for details.
+
+// func crc32sse(a []byte) uint32
+TEXT ·crc32sse(SB), 4, $0
+	MOVQ a+0(FP), R10
+	XORQ BX, BX
+
+	// CRC32   dword (R10), EBX
+	BYTE $0xF2; BYTE $0x41; BYTE $0x0f
+	BYTE $0x38; BYTE $0xf1; BYTE $0x1a
+
+	MOVL BX, ret+24(FP)
+	RET
+
+// func crc32sseAll(a []byte, dst []uint32)
+TEXT ·crc32sseAll(SB), 4, $0
+	MOVQ  a+0(FP), R8      // R8: src
+	MOVQ  a_len+8(FP), R10 // input length
+	MOVQ  dst+24(FP), R9   // R9: dst
+	SUBQ  $4, R10
+	JS    end
+	JZ    one_crc
+	MOVQ  R10, R13
+	SHRQ  $2, R10          // len/4
+	ANDQ  $3, R13          // len&3
+	XORQ  BX, BX
+	ADDQ  $1, R13
+	TESTQ R10, R10
+	JZ    rem_loop
+
+crc_loop:
+	MOVQ (R8), R11
+	XORQ BX, BX
+	XORQ DX, DX
+	XORQ DI, DI
+	MOVQ R11, R12
+	SHRQ $8, R11
+	MOVQ R12, AX
+	MOVQ R11, CX
+	SHRQ $16, R12
+	SHRQ $16, R11
+	MOVQ R12, SI
+
+	// CRC32   EAX, EBX
+	BYTE $0xF2; BYTE $0x0f
+	BYTE $0x38; BYTE $0xf1; BYTE $0xd8
+
+	// CRC32   ECX, EDX
+	BYTE $0xF2; BYTE $0x0f
+	BYTE $0x38; BYTE $0xf1; BYTE $0xd1
+
+	// CRC32   ESI, EDI
+	BYTE $0xF2; BYTE $0x0f
+	BYTE $0x38; BYTE $0xf1; BYTE $0xfe
+	MOVL BX, (R9)
+	MOVL DX, 4(R9)
+	MOVL DI, 8(R9)
+
+	XORQ BX, BX
+	MOVL R11, AX
+
+	// CRC32   EAX, EBX
+	BYTE $0xF2; BYTE $0x0f
+	BYTE $0x38; BYTE $0xf1; BYTE $0xd8
+	MOVL BX, 12(R9)
+
+	ADDQ $16, R9
+	ADDQ $4, R8
+	XORQ BX, BX
+	SUBQ $1, R10
+	JNZ  crc_loop
+
+rem_loop:
+	MOVL (R8), AX
+
+	// CRC32   EAX, EBX
+	BYTE $0xF2; BYTE $0x0f
+	BYTE $0x38; BYTE $0xf1; BYTE $0xd8
+
+	MOVL BX, (R9)
+	ADDQ $4, R9
+	ADDQ $1, R8
+	XORQ BX, BX
+	SUBQ $1, R13
+	JNZ  rem_loop
+
+end:
+	RET
+
+one_crc:
+	MOVQ $1, R13
+	XORQ BX, BX
+	JMP  rem_loop
+
+// func matchLenSSE4(a, b []byte, max int) int
+TEXT ·matchLenSSE4(SB), 4, $0
+	MOVQ a_base+0(FP), SI
+	MOVQ b_base+24(FP), DI
+	MOVQ DI, DX
+	MOVQ max+48(FP), CX
+
+cmp8:
+	// As long as we are 8 or more bytes before the end of max, we can load and
+	// compare 8 bytes at a time. If those 8 bytes are equal, repeat.
+	CMPQ CX, $8
+	JLT  cmp1
+	MOVQ (SI), AX
+	MOVQ (DI), BX
+	CMPQ AX, BX
+	JNE  bsf
+	ADDQ $8, SI
+	ADDQ $8, DI
+	SUBQ $8, CX
+	JMP  cmp8
+
+bsf:
+	// If those 8 bytes were not equal, XOR the two 8 byte values, and return
+	// the index of the first byte that differs. The BSF instruction finds the
+	// least significant 1 bit, the amd64 architecture is little-endian, and
+	// the shift by 3 converts a bit index to a byte index.
+	XORQ AX, BX
+	BSFQ BX, BX
+	SHRQ $3, BX
+	ADDQ BX, DI
+
+	// Subtract off &b[0] to convert from &b[ret] to ret, and return.
+	SUBQ DX, DI
+	MOVQ DI, ret+56(FP)
+	RET
+
+cmp1:
+	// In the slices' tail, compare 1 byte at a time.
+	CMPQ CX, $0
+	JEQ  matchLenEnd
+	MOVB (SI), AX
+	MOVB (DI), BX
+	CMPB AX, BX
+	JNE  matchLenEnd
+	ADDQ $1, SI
+	ADDQ $1, DI
+	SUBQ $1, CX
+	JMP  cmp1
+
+matchLenEnd:
+	// Subtract off &b[0] to convert from &b[ret] to ret, and return.
+	SUBQ DX, DI
+	MOVQ DI, ret+56(FP)
+	RET
+
+// func histogram(b []byte, h []int32)
+TEXT ·histogram(SB), 4, $0
+	MOVQ b+0(FP), SI     // SI: &b
+	MOVQ b_len+8(FP), R9 // R9: len(b)
+	MOVQ h+24(FP), DI    // DI: Histogram
+	MOVQ R9, R8
+	SHRQ $3, R8
+	JZ   hist1
+	XORQ R11, R11
+
+loop_hist8:
+	MOVQ (SI), R10
+
+	MOVB R10, R11
+	INCL (DI)(R11*4)
+	SHRQ $8, R10
+
+	MOVB R10, R11
+	INCL (DI)(R11*4)
+	SHRQ $8, R10
+
+	MOVB R10, R11
+	INCL (DI)(R11*4)
+	SHRQ $8, R10
+
+	MOVB R10, R11
+	INCL (DI)(R11*4)
+	SHRQ $8, R10
+
+	MOVB R10, R11
+	INCL (DI)(R11*4)
+	SHRQ $8, R10
+
+	MOVB R10, R11
+	INCL (DI)(R11*4)
+	SHRQ $8, R10
+
+	MOVB R10, R11
+	INCL (DI)(R11*4)
+	SHRQ $8, R10
+
+	INCL (DI)(R10*4)
+
+	ADDQ $8, SI
+	DECQ R8
+	JNZ  loop_hist8
+
+hist1:
+	ANDQ $7, R9
+	JZ   end_hist
+	XORQ R10, R10
+
+loop_hist1:
+	MOVB (SI), R10
+	INCL (DI)(R10*4)
+	INCQ SI
+	DECQ R9
+	JNZ  loop_hist1
+
+end_hist:
+	RET
--- a/vendor/github.com/klauspost/compress/flate/crc32_noasm.go
+++ b/vendor/github.com/klauspost/compress/flate/crc32_noasm.go
@@ -0,0 +1,35 @@
+//+build !amd64 noasm appengine
+
+// Copyright 2015, Klaus Post, see LICENSE for details.
+
+package flate
+
+func init() {
+	useSSE42 = false
+}
+
+// crc32sse should never be called.
+func crc32sse(a []byte) uint32 {
+	panic("no assembler")
+}
+
+// crc32sseAll should never be called.
+func crc32sseAll(a []byte, dst []uint32) {
+	panic("no assembler")
+}
+
+// matchLenSSE4 should never be called.
+func matchLenSSE4(a, b []byte, max int) int {
+	panic("no assembler")
+	return 0
+}
+
+// histogram accumulates a histogram of b in h.
+//
+// len(h) must be >= 256, and h's elements must be all zeroes.
+func histogram(b []byte, h []int32) {
+	h = h[:256]
+	for _, t := range b {
+		h[t]++
+	}
+}
--- a/vendor/github.com/klauspost/compress/flate/deflate.go
+++ b/vendor/github.com/klauspost/compress/flate/deflate.go
--- a/vendor/github.com/klauspost/compress/flate/deflate_test.go
+++ b/vendor/github.com/klauspost/compress/flate/deflate_test.go
@@ -0,0 +1,648 @@
+// Copyright 2009 The Go Authors. All rights reserved.
+// Copyright (c) 2015 Klaus Post
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+package flate
+
+import (
+	"bytes"
+	"fmt"
+	"io"
+	"io/ioutil"
+	"reflect"
+	"strings"
+	"sync"
+	"testing"
+)
+
+type deflateTest struct {
+	in    []byte
+	level int
+	out   []byte
+}
+
+type deflateInflateTest struct {
+	in []byte
+}
+
+type reverseBitsTest struct {
+	in       uint16
+	bitCount uint8
+	out      uint16
+}
+
+var deflateTests = []*deflateTest{
+	{[]byte{}, 0, []byte{1, 0, 0, 255, 255}},
+	{[]byte{0x11}, BestCompression, []byte{18, 4, 4, 0, 0, 255, 255}},
+	{[]byte{0x11}, BestCompression, []byte{18, 4, 4, 0, 0, 255, 255}},
+	{[]byte{0x11}, BestCompression, []byte{18, 4, 4, 0, 0, 255, 255}},
+
+	{[]byte{0x11}, 0, []byte{0, 1, 0, 254, 255, 17, 1, 0, 0, 255, 255}},
+	{[]byte{0x11, 0x12}, 0, []byte{0, 2, 0, 253, 255, 17, 18, 1, 0, 0, 255, 255}},
+	{[]byte{0x11, 0x11, 0x11, 0x11, 0x11, 0x11, 0x11, 0x11}, 0,
+		[]byte{0, 8, 0, 247, 255, 17, 17, 17, 17, 17, 17, 17, 17, 1, 0, 0, 255, 255},
+	},
+	{[]byte{}, 1, []byte{1, 0, 0, 255, 255}},
+	{[]byte{0x11}, BestCompression, []byte{18, 4, 4, 0, 0, 255, 255}},
+	{[]byte{0x11, 0x12}, BestCompression, []byte{18, 20, 2, 4, 0, 0, 255, 255}},
+	{[]byte{0x11, 0x11, 0x11, 0x11, 0x11, 0x11, 0x11, 0x11}, BestCompression, []byte{18, 132, 2, 64, 0, 0, 0, 255, 255}},
+	{[]byte{}, 9, []byte{1, 0, 0, 255, 255}},
+	{[]byte{0x11}, 9, []byte{18, 4, 4, 0, 0, 255, 255}},
+	{[]byte{0x11, 0x12}, 9, []byte{18, 20, 2, 4, 0, 0, 255, 255}},
+	{[]byte{0x11, 0x11, 0x11, 0x11, 0x11, 0x11, 0x11, 0x11}, 9, []byte{18, 132, 2, 64, 0, 0, 0, 255, 255}},
+}
+
+var deflateInflateTests = []*deflateInflateTest{
+	{[]byte{}},
+	{[]byte{0x11}},
+	{[]byte{0x11, 0x12}},
+	{[]byte{0x11, 0x11, 0x11, 0x11, 0x11, 0x11, 0x11, 0x11}},
+	{[]byte{0x11, 0x10, 0x13, 0x41, 0x21, 0x21, 0x41, 0x13, 0x87, 0x78, 0x13}},
+	{largeDataChunk()},
+}
+
+var reverseBitsTests = []*reverseBitsTest{
+	{1, 1, 1},
+	{1, 2, 2},
+	{1, 3, 4},
+	{1, 4, 8},
+	{1, 5, 16},
+	{17, 5, 17},
+	{257, 9, 257},
+	{29, 5, 23},
+}
+
+func largeDataChunk() []byte {
+	result := make([]byte, 100000)
+	for i := range result {
+		result[i] = byte(i * i & 0xFF)
+	}
+	return result
+}
+
+func TestCRCBulkOld(t *testing.T) {
+	for _, x := range deflateTests {
+		y := x.out
+		if len(y) >= minMatchLength {
+			y = append(y, y...)
+			for j := 4; j < len(y); j++ {
+				y := y[:j]
+				dst := make([]uint32, len(y)-minMatchLength+1)
+				for i := range dst {
+					dst[i] = uint32(i + 100)
+				}
+				bulkHash4(y, dst)
+				for i, val := range dst {
+					got := val
+					expect := hash4(y[i:])
+					if got != expect && got == uint32(i)+100 {
+						t.Errorf("Len:%d Index:%d, expected 0x%08x but not modified", len(y), i, expect)
+					} else if got != expect {
+						t.Errorf("Len:%d Index:%d, got 0x%08x expected:0x%08x", len(y), i, got, expect)
+					} else {
+						//t.Logf("Len:%d Index:%d OK (0x%08x)", len(y), i, got)
+					}
+				}
+			}
+		}
+	}
+}
+
+func TestDeflate(t *testing.T) {
+	for _, h := range deflateTests {
+		var buf bytes.Buffer
+		w, err := NewWriter(&buf, h.level)
+		if err != nil {
+			t.Errorf("NewWriter: %v", err)
+			continue
+		}
+		w.Write(h.in)
+		w.Close()
+		if !bytes.Equal(buf.Bytes(), h.out) {
+			t.Errorf("Deflate(%d, %x) = \n%#v, want \n%#v", h.level, h.in, buf.Bytes(), h.out)
+		}
+	}
+}
+
+// A sparseReader returns a stream consisting of 0s followed by 1<<16 1s.
+// This tests missing hash references in a very large input.
+type sparseReader struct {
+	l   int64
+	cur int64
+}
+
+func (r *sparseReader) Read(b []byte) (n int, err error) {
+	if r.cur >= r.l {
+		return 0, io.EOF
+	}
+	n = len(b)
+	cur := r.cur + int64(n)
+	if cur > r.l {
+		n -= int(cur - r.l)
+		cur = r.l
+	}
+	for i := range b[0:n] {
+		if r.cur+int64(i) >= r.l-1<<16 {
+			b[i] = 1
+		} else {
+			b[i] = 0
+		}
+	}
+	r.cur = cur
+	return
+}
+
+func TestVeryLongSparseChunk(t *testing.T) {
+	if testing.Short() {
+		t.Skip("skipping sparse chunk during short test")
+	}
+	w, err := NewWriter(ioutil.Discard, 1)
+	if err != nil {
+		t.Errorf("NewWriter: %v", err)
+		return
+	}
+	if _, err = io.Copy(w, &sparseReader{l: 23E8}); err != nil {
+		t.Errorf("Compress failed: %v", err)
+		return
+	}
+}
+
+type syncBuffer struct {
+	buf    bytes.Buffer
+	mu     sync.RWMutex
+	closed bool
+	ready  chan bool
+}
+
+func newSyncBuffer() *syncBuffer {
+	return &syncBuffer{ready: make(chan bool, 1)}
+}
+
+func (b *syncBuffer) Read(p []byte) (n int, err error) {
+	for {
+		b.mu.RLock()
+		n, err = b.buf.Read(p)
+		b.mu.RUnlock()
+		if n > 0 || b.closed {
+			return
+		}
+		<-b.ready
+	}
+}
+
+func (b *syncBuffer) signal() {
+	select {
+	case b.ready <- true:
+	default:
+	}
+}
+
+func (b *syncBuffer) Write(p []byte) (n int, err error) {
+	n, err = b.buf.Write(p)
+	b.signal()
+	return
+}
+
+func (b *syncBuffer) WriteMode() {
+	b.mu.Lock()
+}
+
+func (b *syncBuffer) ReadMode() {
+	b.mu.Unlock()
+	b.signal()
+}
+
+func (b *syncBuffer) Close() error {
+	b.closed = true
+	b.signal()
+	return nil
+}
+
+func testSync(t *testing.T, level int, input []byte, name string) {
+	if len(input) == 0 {
+		return
+	}
+
+	t.Logf("--testSync %d, %d, %s", level, len(input), name)
+	buf := newSyncBuffer()
+	buf1 := new(bytes.Buffer)
+	buf.WriteMode()
+	w, err := NewWriter(io.MultiWriter(buf, buf1), level)
+	if err != nil {
+		t.Errorf("NewWriter: %v", err)
+		return
+	}
+	r := NewReader(buf)
+
+	// Write half the input and read back.
+	for i := 0; i < 2; i++ {
+		var lo, hi int
+		if i == 0 {
+			lo, hi = 0, (len(input)+1)/2
+		} else {
+			lo, hi = (len(input)+1)/2, len(input)
+		}
+		t.Logf("#%d: write %d-%d", i, lo, hi)
+		if _, err := w.Write(input[lo:hi]); err != nil {
+			t.Errorf("testSync: write: %v", err)
+			return
+		}
+		if i == 0 {
+			if err := w.Flush(); err != nil {
+				t.Errorf("testSync: flush: %v", err)
+				return
+			}
+		} else {
+			if err := w.Close(); err != nil {
+				t.Errorf("testSync: close: %v", err)
+			}
+		}
+		buf.ReadMode()
+		out := make([]byte, hi-lo+1)
+		m, err := io.ReadAtLeast(r, out, hi-lo)
+		t.Logf("#%d: read %d", i, m)
+		if m != hi-lo || err != nil {
+			t.Errorf("testSync/%d (%d, %d, %s): read %d: %d, %v (%d left)", i, level, len(input), name, hi-lo, m, err, buf.buf.Len())
+			return
+		}
+		if !bytes.Equal(input[lo:hi], out[:hi-lo]) {
+			t.Errorf("testSync/%d: read wrong bytes: %x vs %x", i, input[lo:hi], out[:hi-lo])
+			return
+		}
+		// This test originally checked that after reading
+		// the first half of the input, there was nothing left
+		// in the read buffer (buf.buf.Len() != 0) but that is
+		// not necessarily the case: the write Flush may emit
+		// some extra framing bits that are not necessary
+		// to process to obtain the first half of the uncompressed
+		// data.  The test ran correctly most of the time, because
+		// the background goroutine had usually read even
+		// those extra bits by now, but it's not a useful thing to
+		// check.
+		buf.WriteMode()
+	}
+	buf.ReadMode()
+	out := make([]byte, 10)
+	if n, err := r.Read(out); n > 0 || err != io.EOF {
+		t.Errorf("testSync (%d, %d, %s): final Read: %d, %v (hex: %x)", level, len(input), name, n, err, out[0:n])
+	}
+	if buf.buf.Len() != 0 {
+		t.Errorf("testSync (%d, %d, %s): extra data at end", level, len(input), name)
+	}
+	r.Close()
+
+	// stream should work for ordinary reader too
+	r = NewReader(buf1)
+	out, err = ioutil.ReadAll(r)
+	if err != nil {
+		t.Errorf("testSync: read: %s", err)
+		return
+	}
+	r.Close()
+	if !bytes.Equal(input, out) {
+		t.Errorf("testSync: decompress(compress(data)) != data: level=%d input=%s", level, name)
+	}
+}
+
+func testToFromWithLevelAndLimit(t *testing.T, level int, input []byte, name string, limit int) {
+	var buffer bytes.Buffer
+	w, err := NewWriter(&buffer, level)
+	if err != nil {
+		t.Errorf("NewWriter: %v", err)
+		return
+	}
+	w.Write(input)
+	w.Close()
+	if limit > 0 && buffer.Len() > limit {
+		t.Errorf("level: %d, len(compress(data)) = %d > limit = %d", level, buffer.Len(), limit)
+		return
+	}
+	if limit > 0 {
+		t.Logf("level: %d - Size:%.2f%%, %d b\n", level, float64(buffer.Len()*100)/float64(limit), buffer.Len())
+	}
+	r := NewReader(&buffer)
+	out, err := ioutil.ReadAll(r)
+	if err != nil {
+		t.Errorf("read: %s", err)
+		return
+	}
+	r.Close()
+	if !bytes.Equal(input, out) {
+		t.Errorf("decompress(compress(data)) != data: level=%d input=%s", level, name)
+		return
+	}
+	testSync(t, level, input, name)
+}
+
+func testToFromWithLimit(t *testing.T, input []byte, name string, limit [11]int) {
+	for i := 0; i < 10; i++ {
+		testToFromWithLevelAndLimit(t, i, input, name, limit[i])
+	}
+	testToFromWithLevelAndLimit(t, -2, input, name, limit[10])
+}
+
+func TestDeflateInflate(t *testing.T) {
+	for i, h := range deflateInflateTests {
+		testToFromWithLimit(t, h.in, fmt.Sprintf("#%d", i), [11]int{})
+	}
+}
+
+func TestReverseBits(t *testing.T) {
+	for _, h := range reverseBitsTests {
+		if v := reverseBits(h.in, h.bitCount); v != h.out {
+			t.Errorf("reverseBits(%v,%v) = %v, want %v",
+				h.in, h.bitCount, v, h.out)
+		}
+	}
+}
+
+type deflateInflateStringTest struct {
+	filename string
+	label    string
+	limit    [11]int // Number 11 is ConstantCompression
+}
+
+var deflateInflateStringTests = []deflateInflateStringTest{
+	{
+		"../testdata/e.txt",
+		"2.718281828...",
+		[...]int{100018, 67900, 50960, 51150, 50930, 50790, 50790, 50790, 50790, 50790, 43683 + 100},
+	},
+	{
+		"../testdata/Mark.Twain-Tom.Sawyer.txt",
+		"Mark.Twain-Tom.Sawyer",
+		[...]int{387999, 185000, 182361, 179974, 174124, 168819, 162936, 160506, 160295, 160295, 233460 + 100},
+	},
+}
+
+func TestDeflateInflateString(t *testing.T) {
+	for _, test := range deflateInflateStringTests {
+		gold, err := ioutil.ReadFile(test.filename)
+		if err != nil {
+			t.Error(err)
+		}
+		// Remove returns that may be present on Windows
+		neutral := strings.Map(func(r rune) rune {
+			if r != '\r' {
+				return r
+			}
+			return -1
+		}, string(gold))
+
+		testToFromWithLimit(t, []byte(neutral), test.label, test.limit)
+
+		if testing.Short() {
+			break
+		}
+	}
+}
+
+func TestReaderDict(t *testing.T) {
+	const (
+		dict = "hello world"
+		text = "hello again world"
+	)
+	var b bytes.Buffer
+	w, err := NewWriter(&b, 5)
+	if err != nil {
+		t.Fatalf("NewWriter: %v", err)
+	}
+	w.Write([]byte(dict))
+	w.Flush()
+	b.Reset()
+	w.Write([]byte(text))
+	w.Close()
+
+	r := NewReaderDict(&b, []byte(dict))
+	data, err := ioutil.ReadAll(r)
+	if err != nil {
+		t.Fatal(err)
+	}
+	if string(data) != "hello again world" {
+		t.Fatalf("read returned %q want %q", string(data), text)
+	}
+}
+
+func TestWriterDict(t *testing.T) {
+	const (
+		dict = "hello world Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua."
+		text = "hello world Lorem ipsum dolor sit amet"
+	)
+	// This test is sensitive to algorithm changes that skip
+	// data in favour of speed. Higher levels are less prone to this
+	// so we test level 4-9.
+	for l := 4; l < 9; l++ {
+		var b bytes.Buffer
+		w, err := NewWriter(&b, l)
+		if err != nil {
+			t.Fatalf("level %d, NewWriter: %v", l, err)
+		}
+		w.Write([]byte(dict))
+		w.Flush()
+		b.Reset()
+		w.Write([]byte(text))
+		w.Close()
+
+		var b1 bytes.Buffer
+		w, _ = NewWriterDict(&b1, l, []byte(dict))
+		w.Write([]byte(text))
+		w.Close()
+
+		if !bytes.Equal(b1.Bytes(), b.Bytes()) {
+			t.Errorf("level %d, writer wrote\n%v\n want\n%v", l, b1.Bytes(), b.Bytes())
+		}
+	}
+}
+
+// See http://code.google.com/p/go/issues/detail?id=2508
+func TestRegression2508(t *testing.T) {
+	if testing.Short() {
+		t.Logf("test disabled with -short")
+		return
+	}
+	w, err := NewWriter(ioutil.Discard, 1)
+	if err != nil {
+		t.Fatalf("NewWriter: %v", err)
+	}
+	buf := make([]byte, 1024)
+	for i := 0; i < 131072; i++ {
+		if _, err := w.Write(buf); err != nil {
+			t.Fatalf("writer failed: %v", err)
+		}
+	}
+	w.Close()
+}
+
+func TestWriterReset(t *testing.T) {
+	for level := -2; level <= 9; level++ {
+		if level == -1 {
+			level++
+		}
+		if testing.Short() && level > 1 {
+			break
+		}
+		w, err := NewWriter(ioutil.Discard, level)
+		if err != nil {
+			t.Fatalf("NewWriter: %v", err)
+		}
+		buf := []byte("hello world")
+		for i := 0; i < 1024; i++ {
+			w.Write(buf)
+		}
+		w.Reset(ioutil.Discard)
+
+		wref, err := NewWriter(ioutil.Discard, level)
+		if err != nil {
+			t.Fatalf("NewWriter: %v", err)
+		}
+
+		// DeepEqual doesn't compare functions.
+		w.d.fill, wref.d.fill = nil, nil
+		w.d.step, wref.d.step = nil, nil
+		w.d.bulkHasher, wref.d.bulkHasher = nil, nil
+		w.d.snap, wref.d.snap = nil, nil
+
+		// hashMatch is always overwritten when used.
+		copy(w.d.hashMatch[:], wref.d.hashMatch[:])
+		if w.d.tokens.n != 0 {
+			t.Errorf("level %d Writer not reset after Reset. %d tokens were present", level, w.d.tokens.n)
+		}
+		// As long as the length is 0, we don't care about the content.
+		w.d.tokens = wref.d.tokens
+
+		// We don't care if there are values in the window, as long as it is at d.index is 0
+		w.d.window = wref.d.window
+		if !reflect.DeepEqual(w, wref) {
+			t.Errorf("level %d Writer not reset after Reset", level)
+		}
+	}
+	testResetOutput(t, func(w io.Writer) (*Writer, error) { return NewWriter(w, NoCompression) })
+	testResetOutput(t, func(w io.Writer) (*Writer, error) { return NewWriter(w, DefaultCompression) })
+	testResetOutput(t, func(w io.Writer) (*Writer, error) { return NewWriter(w, BestCompression) })
+	testResetOutput(t, func(w io.Writer) (*Writer, error) { return NewWriter(w, ConstantCompression) })
+	dict := []byte("we are the world")
+	testResetOutput(t, func(w io.Writer) (*Writer, error) { return NewWriterDict(w, NoCompression, dict) })
+	testResetOutput(t, func(w io.Writer) (*Writer, error) { return NewWriterDict(w, DefaultCompression, dict) })
+	testResetOutput(t, func(w io.Writer) (*Writer, error) { return NewWriterDict(w, BestCompression, dict) })
+	testResetOutput(t, func(w io.Writer) (*Writer, error) { return NewWriterDict(w, ConstantCompression, dict) })
+}
+
+func testResetOutput(t *testing.T, newWriter func(w io.Writer) (*Writer, error)) {
+	buf := new(bytes.Buffer)
+	w, err := newWriter(buf)
+	if err != nil {
+		t.Fatalf("NewWriter: %v", err)
+	}
+	b := []byte("hello world")
+	for i := 0; i < 1024; i++ {
+		w.Write(b)
+	}
+	w.Close()
+	out1 := buf.Bytes()
+
+	buf2 := new(bytes.Buffer)
+	w.Reset(buf2)
+	for i := 0; i < 1024; i++ {
+		w.Write(b)
+	}
+	w.Close()
+	out2 := buf2.Bytes()
+
+	if len(out1) != len(out2) {
+		t.Errorf("got %d, expected %d bytes", len(out2), len(out1))
+	}
+	if bytes.Compare(out1, out2) != 0 {
+		mm := 0
+		for i, b := range out1[:len(out2)] {
+			if b != out2[i] {
+				t.Errorf("mismatch index %d: %02x, expected %02x", i, out2[i], b)
+			}
+			mm++
+			if mm == 10 {
+				t.Fatal("Stopping")
+			}
+		}
+	}
+	t.Logf("got %d bytes", len(out1))
+}
+
+// TestBestSpeed tests that round-tripping through deflate and then inflate
+// recovers the original input. The Write sizes are near the thresholds in the
+// compressor.encSpeed method (0, 16, 128), as well as near maxStoreBlockSize
+// (65535).
+func TestBestSpeed(t *testing.T) {
+	abc := make([]byte, 128)
+	for i := range abc {
+		abc[i] = byte(i)
+	}
+	abcabc := bytes.Repeat(abc, 131072/len(abc))
+	var want []byte
+
+	testCases := [][]int{
+		{65536, 0},
+		{65536, 1},
+		{65536, 1, 256},
+		{65536, 1, 65536},
+		{65536, 14},
+		{65536, 15},
+		{65536, 16},
+		{65536, 16, 256},
+		{65536, 16, 65536},
+		{65536, 127},
+		{65536, 128},
+		{65536, 128, 256},
+		{65536, 128, 65536},
+		{65536, 129},
+		{65536, 65536, 256},
+		{65536, 65536, 65536},
+	}
+
+	for i, tc := range testCases {
+		for _, firstN := range []int{1, 65534, 65535, 65536, 65537, 131072} {
+			tc[0] = firstN
+		outer:
+			for _, flush := range []bool{false, true} {
+				buf := new(bytes.Buffer)
+				want = want[:0]
+
+				w, err := NewWriter(buf, BestSpeed)
+				if err != nil {
+					t.Errorf("i=%d, firstN=%d, flush=%t: NewWriter: %v", i, firstN, flush, err)
+					continue
+				}
+				for _, n := range tc {
+					want = append(want, abcabc[:n]...)
+					if _, err := w.Write(abcabc[:n]); err != nil {
+						t.Errorf("i=%d, firstN=%d, flush=%t: Write: %v", i, firstN, flush, err)
+						continue outer
+					}
+					if !flush {
+						continue
+					}
+					if err := w.Flush(); err != nil {
+						t.Errorf("i=%d, firstN=%d, flush=%t: Flush: %v", i, firstN, flush, err)
+						continue outer
+					}
+				}
+				if err := w.Close(); err != nil {
+					t.Errorf("i=%d, firstN=%d, flush=%t: Close: %v", i, firstN, flush, err)
+					continue
+				}
+
+				r := NewReader(buf)
+				got, err := ioutil.ReadAll(r)
+				if err != nil {
+					t.Errorf("i=%d, firstN=%d, flush=%t: ReadAll: %v", i, firstN, flush, err)
+					continue
+				}
+				r.Close()
+
+				if !bytes.Equal(got, want) {
+					t.Errorf("i=%d, firstN=%d, flush=%t: corruption during deflate-then-inflate", i, firstN, flush)
+					continue
+				}
+			}
+		}
+	}
+}
--- a/vendor/github.com/klauspost/compress/flate/dict_decoder.go
+++ b/vendor/github.com/klauspost/compress/flate/dict_decoder.go
@@ -0,0 +1,184 @@
+// Copyright 2016 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+package flate
+
+// dictDecoder implements the LZ77 sliding dictionary as used in decompression.
+// LZ77 decompresses data through sequences of two forms of commands:
+//
+//	* Literal insertions: Runs of one or more symbols are inserted into the data
+//	stream as is. This is accomplished through the writeByte method for a
+//	single symbol, or combinations of writeSlice/writeMark for multiple symbols.
+//	Any valid stream must start with a literal insertion if no preset dictionary
+//	is used.
+//
+//	* Backward copies: Runs of one or more symbols are copied from previously
+//	emitted data. Backward copies come as the tuple (dist, length) where dist
+//	determines how far back in the stream to copy from and length determines how
+//	many bytes to copy. Note that it is valid for the length to be greater than
+//	the distance. Since LZ77 uses forward copies, that situation is used to
+//	perform a form of run-length encoding on repeated runs of symbols.
+//	The writeCopy and tryWriteCopy are used to implement this command.
+//
+// For performance reasons, this implementation performs little to no sanity
+// checks about the arguments. As such, the invariants documented for each
+// method call must be respected.
+type dictDecoder struct {
+	hist []byte // Sliding window history
+
+	// Invariant: 0 <= rdPos <= wrPos <= len(hist)
+	wrPos int  // Current output position in buffer
+	rdPos int  // Have emitted hist[:rdPos] already
+	full  bool // Has a full window length been written yet?
+}
+
+// init initializes dictDecoder to have a sliding window dictionary of the given
+// size. If a preset dict is provided, it will initialize the dictionary with
+// the contents of dict.
+func (dd *dictDecoder) init(size int, dict []byte) {
+	*dd = dictDecoder{hist: dd.hist}
+
+	if cap(dd.hist) < size {
+		dd.hist = make([]byte, size)
+	}
+	dd.hist = dd.hist[:size]
+
+	if len(dict) > len(dd.hist) {
+		dict = dict[len(dict)-len(dd.hist):]
+	}
+	dd.wrPos = copy(dd.hist, dict)
+	if dd.wrPos == len(dd.hist) {
+		dd.wrPos = 0
+		dd.full = true
+	}
+	dd.rdPos = dd.wrPos
+}
+
+// histSize reports the total amount of historical data in the dictionary.
+func (dd *dictDecoder) histSize() int {
+	if dd.full {
+		return len(dd.hist)
+	}
+	return dd.wrPos
+}
+
+// availRead reports the number of bytes that can be flushed by readFlush.
+func (dd *dictDecoder) availRead() int {
+	return dd.wrPos - dd.rdPos
+}
+
+// availWrite reports the available amount of output buffer space.
+func (dd *dictDecoder) availWrite() int {
+	return len(dd.hist) - dd.wrPos
+}
+
+// writeSlice returns a slice of the available buffer to write data to.
+//
+// This invariant will be kept: len(s) <= availWrite()
+func (dd *dictDecoder) writeSlice() []byte {
+	return dd.hist[dd.wrPos:]
+}
+
+// writeMark advances the writer pointer by cnt.
+//
+// This invariant must be kept: 0 <= cnt <= availWrite()
+func (dd *dictDecoder) writeMark(cnt int) {
+	dd.wrPos += cnt
+}
+
+// writeByte writes a single byte to the dictionary.
+//
+// This invariant must be kept: 0 < availWrite()
+func (dd *dictDecoder) writeByte(c byte) {
+	dd.hist[dd.wrPos] = c
+	dd.wrPos++
+}
+
+// writeCopy copies a string at a given (dist, length) to the output.
+// This returns the number of bytes copied and may be less than the requested
+// length if the available space in the output buffer is too small.
+//
+// This invariant must be kept: 0 < dist <= histSize()
+func (dd *dictDecoder) writeCopy(dist, length int) int {
+	dstBase := dd.wrPos
+	dstPos := dstBase
+	srcPos := dstPos - dist
+	endPos := dstPos + length
+	if endPos > len(dd.hist) {
+		endPos = len(dd.hist)
+	}
+
+	// Copy non-overlapping section after destination position.
+	//
+	// This section is non-overlapping in that the copy length for this section
+	// is always less than or equal to the backwards distance. This can occur
+	// if a distance refers to data that wraps-around in the buffer.
+	// Thus, a backwards copy is performed here; that is, the exact bytes in
+	// the source prior to the copy is placed in the destination.
+	if srcPos < 0 {
+		srcPos += len(dd.hist)
+		dstPos += copy(dd.hist[dstPos:endPos], dd.hist[srcPos:])
+		srcPos = 0
+	}
+
+	// Copy possibly overlapping section before destination position.
+	//
+	// This section can overlap if the copy length for this section is larger
+	// than the backwards distance. This is allowed by LZ77 so that repeated
+	// strings can be succinctly represented using (dist, length) pairs.
+	// Thus, a forwards copy is performed here; that is, the bytes copied is
+	// possibly dependent on the resulting bytes in the destination as the copy
+	// progresses along. This is functionally equivalent to the following:
+	//
+	//	for i := 0; i < endPos-dstPos; i++ {
+	//		dd.hist[dstPos+i] = dd.hist[srcPos+i]
+	//	}
+	//	dstPos = endPos
+	//
+	for dstPos < endPos {
+		dstPos += copy(dd.hist[dstPos:endPos], dd.hist[srcPos:dstPos])
+	}
+
+	dd.wrPos = dstPos
+	return dstPos - dstBase
+}
+
+// tryWriteCopy tries to copy a string at a given (distance, length) to the
+// output. This specialized version is optimized for short distances.
+//
+// This method is designed to be inlined for performance reasons.
+//
+// This invariant must be kept: 0 < dist <= histSize()
+func (dd *dictDecoder) tryWriteCopy(dist, length int) int {
+	dstPos := dd.wrPos
+	endPos := dstPos + length
+	if dstPos < dist || endPos > len(dd.hist) {
+		return 0
+	}
+	dstBase := dstPos
+	srcPos := dstPos - dist
+
+	// Copy possibly overlapping section before destination position.
+loop:
+	dstPos += copy(dd.hist[dstPos:endPos], dd.hist[srcPos:dstPos])
+	if dstPos < endPos {
+		goto loop // Avoid for-loop so that this function can be inlined
+	}
+
+	dd.wrPos = dstPos
+	return dstPos - dstBase
+}
+
+// readFlush returns a slice of the historical buffer that is ready to be
+// emitted to the user. The data returned by readFlush must be fully consumed
+// before calling any other dictDecoder methods.
+func (dd *dictDecoder) readFlush() []byte {
+	toRead := dd.hist[dd.rdPos:dd.wrPos]
+	dd.rdPos = dd.wrPos
+	if dd.wrPos == len(dd.hist) {
+		dd.wrPos, dd.rdPos = 0, 0
+		dd.full = true
+	}
+	return toRead
+}
--- a/vendor/github.com/klauspost/compress/flate/dict_decoder_test.go
+++ b/vendor/github.com/klauspost/compress/flate/dict_decoder_test.go
@@ -0,0 +1,139 @@
+// Copyright 2016 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+package flate
+
+import (
+	"bytes"
+	"strings"
+	"testing"
+)
+
+func TestDictDecoder(t *testing.T) {
+	const (
+		abc  = "ABC\n"
+		fox  = "The quick brown fox jumped over the lazy dog!\n"
+		poem = "The Road Not Taken\nRobert Frost\n" +
+			"\n" +
+			"Two roads diverged in a yellow wood,\n" +
+			"And sorry I could not travel both\n" +
+			"And be one traveler, long I stood\n" +
+			"And looked down one as far as I could\n" +
+			"To where it bent in the undergrowth;\n" +
+			"\n" +
+			"Then took the other, as just as fair,\n" +
+			"And having perhaps the better claim,\n" +
+			"Because it was grassy and wanted wear;\n" +
+			"Though as for that the passing there\n" +
+			"Had worn them really about the same,\n" +
+			"\n" +
+			"And both that morning equally lay\n" +
+			"In leaves no step had trodden black.\n" +
+			"Oh, I kept the first for another day!\n" +
+			"Yet knowing how way leads on to way,\n" +
+			"I doubted if I should ever come back.\n" +
+			"\n" +
+			"I shall be telling this with a sigh\n" +
+			"Somewhere ages and ages hence:\n" +
+			"Two roads diverged in a wood, and I-\n" +
+			"I took the one less traveled by,\n" +
+			"And that has made all the difference.\n"
+	)
+
+	var poemRefs = []struct {
+		dist   int // Backward distance (0 if this is an insertion)
+		length int // Length of copy or insertion
+	}{
+		{0, 38}, {33, 3}, {0, 48}, {79, 3}, {0, 11}, {34, 5}, {0, 6}, {23, 7},
+		{0, 8}, {50, 3}, {0, 2}, {69, 3}, {34, 5}, {0, 4}, {97, 3}, {0, 4},
+		{43, 5}, {0, 6}, {7, 4}, {88, 7}, {0, 12}, {80, 3}, {0, 2}, {141, 4},
+		{0, 1}, {196, 3}, {0, 3}, {157, 3}, {0, 6}, {181, 3}, {0, 2}, {23, 3},
+		{77, 3}, {28, 5}, {128, 3}, {110, 4}, {70, 3}, {0, 4}, {85, 6}, {0, 2},
+		{182, 6}, {0, 4}, {133, 3}, {0, 7}, {47, 5}, {0, 20}, {112, 5}, {0, 1},
+		{58, 3}, {0, 8}, {59, 3}, {0, 4}, {173, 3}, {0, 5}, {114, 3}, {0, 4},
+		{92, 5}, {0, 2}, {71, 3}, {0, 2}, {76, 5}, {0, 1}, {46, 3}, {96, 4},
+		{130, 4}, {0, 3}, {360, 3}, {0, 3}, {178, 5}, {0, 7}, {75, 3}, {0, 3},
+		{45, 6}, {0, 6}, {299, 6}, {180, 3}, {70, 6}, {0, 1}, {48, 3}, {66, 4},
+		{0, 3}, {47, 5}, {0, 9}, {325, 3}, {0, 1}, {359, 3}, {318, 3}, {0, 2},
+		{199, 3}, {0, 1}, {344, 3}, {0, 3}, {248, 3}, {0, 10}, {310, 3}, {0, 3},
+		{93, 6}, {0, 3}, {252, 3}, {157, 4}, {0, 2}, {273, 5}, {0, 14}, {99, 4},
+		{0, 1}, {464, 4}, {0, 2}, {92, 4}, {495, 3}, {0, 1}, {322, 4}, {16, 4},
+		{0, 3}, {402, 3}, {0, 2}, {237, 4}, {0, 2}, {432, 4}, {0, 1}, {483, 5},
+		{0, 2}, {294, 4}, {0, 2}, {306, 3}, {113, 5}, {0, 1}, {26, 4}, {164, 3},
+		{488, 4}, {0, 1}, {542, 3}, {248, 6}, {0, 5}, {205, 3}, {0, 8}, {48, 3},
+		{449, 6}, {0, 2}, {192, 3}, {328, 4}, {9, 5}, {433, 3}, {0, 3}, {622, 25},
+		{615, 5}, {46, 5}, {0, 2}, {104, 3}, {475, 10}, {549, 3}, {0, 4}, {597, 8},
+		{314, 3}, {0, 1}, {473, 6}, {317, 5}, {0, 1}, {400, 3}, {0, 3}, {109, 3},
+		{151, 3}, {48, 4}, {0, 4}, {125, 3}, {108, 3}, {0, 2},
+	}
+
+	var got, want bytes.Buffer
+	var dd dictDecoder
+	dd.init(1<<11, nil)
+
+	var writeCopy = func(dist, length int) {
+		for length > 0 {
+			cnt := dd.tryWriteCopy(dist, length)
+			if cnt == 0 {
+				cnt = dd.writeCopy(dist, length)
+			}
+
+			length -= cnt
+			if dd.availWrite() == 0 {
+				got.Write(dd.readFlush())
+			}
+		}
+	}
+	var writeString = func(str string) {
+		for len(str) > 0 {
+			cnt := copy(dd.writeSlice(), str)
+			str = str[cnt:]
+			dd.writeMark(cnt)
+			if dd.availWrite() == 0 {
+				got.Write(dd.readFlush())
+			}
+		}
+	}
+
+	writeString(".")
+	want.WriteByte('.')
+
+	str := poem
+	for _, ref := range poemRefs {
+		if ref.dist == 0 {
+			writeString(str[:ref.length])
+		} else {
+			writeCopy(ref.dist, ref.length)
+		}
+		str = str[ref.length:]
+	}
+	want.WriteString(poem)
+
+	writeCopy(dd.histSize(), 33)
+	want.Write(want.Bytes()[:33])
+
+	writeString(abc)
+	writeCopy(len(abc), 59*len(abc))
+	want.WriteString(strings.Repeat(abc, 60))
+
+	writeString(fox)
+	writeCopy(len(fox), 9*len(fox))
+	want.WriteString(strings.Repeat(fox, 10))
+
+	writeString(".")
+	writeCopy(1, 9)
+	want.WriteString(strings.Repeat(".", 10))
+
+	writeString(strings.ToUpper(poem))
+	writeCopy(len(poem), 7*len(poem))
+	want.WriteString(strings.Repeat(strings.ToUpper(poem), 8))
+
+	writeCopy(dd.histSize(), 10)
+	want.Write(want.Bytes()[want.Len()-dd.histSize():][:10])
+
+	got.Write(dd.readFlush())
+	if got.String() != want.String() {
+		t.Errorf("final string mismatch:\ngot  %q\nwant %q", got.String(), want.String())
+	}
+}
--- a/vendor/github.com/klauspost/compress/flate/flate_test.go
+++ b/vendor/github.com/klauspost/compress/flate/flate_test.go
@@ -0,0 +1,260 @@
+// Copyright 2009 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+// This test tests some internals of the flate package.
+// The tests in package compress/gzip serve as the
+// end-to-end test of the decompressor.
+
+package flate
+
+import (
+	"bytes"
+	"encoding/hex"
+	"io/ioutil"
+	"testing"
+)
+
+// The following test should not panic.
+func TestIssue5915(t *testing.T) {
+	bits := []int{4, 0, 0, 6, 4, 3, 2, 3, 3, 4, 4, 5, 0, 0, 0, 0, 5, 5, 6,
+		0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 11, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+		0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 7, 8, 6, 0, 11, 0, 8, 0, 6, 6, 10, 8}
+	var h huffmanDecoder
+	if h.init(bits) {
+		t.Fatalf("Given sequence of bits is bad, and should not succeed.")
+	}
+}
+
+// The following test should not panic.
+func TestIssue5962(t *testing.T) {
+	bits := []int{4, 0, 0, 6, 4, 3, 2, 3, 3, 4, 4, 5, 0, 0, 0, 0,
+		5, 5, 6, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 11}
+	var h huffmanDecoder
+	if h.init(bits) {
+		t.Fatalf("Given sequence of bits is bad, and should not succeed.")
+	}
+}
+
+// The following test should not panic.
+func TestIssue6255(t *testing.T) {
+	bits1 := []int{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 11}
+	bits2 := []int{11, 13}
+	var h huffmanDecoder
+	if !h.init(bits1) {
+		t.Fatalf("Given sequence of bits is good and should succeed.")
+	}
+	if h.init(bits2) {
+		t.Fatalf("Given sequence of bits is bad and should not succeed.")
+	}
+}
+
+func TestInvalidEncoding(t *testing.T) {
+	// Initialize Huffman decoder to recognize "0".
+	var h huffmanDecoder
+	if !h.init([]int{1}) {
+		t.Fatal("Failed to initialize Huffman decoder")
+	}
+
+	// Initialize decompressor with invalid Huffman coding.
+	var f decompressor
+	f.r = bytes.NewReader([]byte{0xff})
+
+	_, err := f.huffSym(&h)
+	if err == nil {
+		t.Fatal("Should have rejected invalid bit sequence")
+	}
+}
+
+func TestInvalidBits(t *testing.T) {
+	oversubscribed := []int{1, 2, 3, 4, 4, 5}
+	incomplete := []int{1, 2, 4, 4}
+	var h huffmanDecoder
+	if h.init(oversubscribed) {
+		t.Fatal("Should reject oversubscribed bit-length set")
+	}
+	if h.init(incomplete) {
+		t.Fatal("Should reject incomplete bit-length set")
+	}
+}
+
+func TestStreams(t *testing.T) {
+	// To verify any of these hexstrings as valid or invalid flate streams
+	// according to the C zlib library, you can use the Python wrapper library:
+	// >>> hex_string = "010100feff11"
+	// >>> import zlib
+	// >>> zlib.decompress(hex_string.decode("hex"), -15) # Negative means raw DEFLATE
+	// '\x11'
+
+	testCases := []struct {
+		desc   string // Description of the stream
+		stream string // Hexstring of the input DEFLATE stream
+		want   string // Expected result. Use "fail" to expect failure
+	}{{
+		"degenerate HCLenTree",
+		"05e0010000000000100000000000000000000000000000000000000000000000" +
+			"00000000000000000004",
+		"fail",
+	}, {
+		"complete HCLenTree, empty HLitTree, empty HDistTree",
+		"05e0010400000000000000000000000000000000000000000000000000000000" +
+			"00000000000000000010",
+		"fail",
+	}, {
+		"empty HCLenTree",
+		"05e0010000000000000000000000000000000000000000000000000000000000" +
+			"00000000000000000010",
+		"fail",
+	}, {
+		"complete HCLenTree, complete HLitTree, empty HDistTree, use missing HDist symbol",
+		"000100feff000de0010400000000100000000000000000000000000000000000" +
+			"0000000000000000000000000000002c",
+		"fail",
+	}, {
+		"complete HCLenTree, complete HLitTree, degenerate HDistTree, use missing HDist symbol",
+		"000100feff000de0010000000000000000000000000000000000000000000000" +
+			"00000000000000000610000000004070",
+		"fail",
+	}, {
+		"complete HCLenTree, empty HLitTree, empty HDistTree",
+		"05e0010400000000100400000000000000000000000000000000000000000000" +
+			"0000000000000000000000000008",
+		"fail",
+	}, {
+		"complete HCLenTree, empty HLitTree, degenerate HDistTree",
+		"05e0010400000000100400000000000000000000000000000000000000000000" +
+			"0000000000000000000800000008",
+		"fail",
+	}, {
+		"complete HCLenTree, degenerate HLitTree, degenerate HDistTree, use missing HLit symbol",
+		"05e0010400000000100000000000000000000000000000000000000000000000" +
+			"0000000000000000001c",
+		"fail",
+	}, {
+		"complete HCLenTree, complete HLitTree, too large HDistTree",
+		"edff870500000000200400000000000000000000000000000000000000000000" +
+			"000000000000000000080000000000000004",
+		"fail",
+	}, {
+		"complete HCLenTree, complete HLitTree, empty HDistTree, excessive repeater code",
+		"edfd870500000000200400000000000000000000000000000000000000000000" +
+			"000000000000000000e8b100",
+		"fail",
+	}, {
+		"complete HCLenTree, complete HLitTree, empty HDistTree of normal length 30",
+		"05fd01240000000000f8ffffffffffffffffffffffffffffffffffffffffffff" +
+			"ffffffffffffffffff07000000fe01",
+		"",
+	}, {
+		"complete HCLenTree, complete HLitTree, empty HDistTree of excessive length 31",
+		"05fe01240000000000f8ffffffffffffffffffffffffffffffffffffffffffff" +
+			"ffffffffffffffffff07000000fc03",
+		"fail",
+	}, {
+		"complete HCLenTree, over-subscribed HLitTree, empty HDistTree",
+		"05e001240000000000fcffffffffffffffffffffffffffffffffffffffffffff" +
+			"ffffffffffffffffff07f00f",
+		"fail",
+	}, {
+		"complete HCLenTree, under-subscribed HLitTree, empty HDistTree",
+		"05e001240000000000fcffffffffffffffffffffffffffffffffffffffffffff" +
+			"fffffffffcffffffff07f00f",
+		"fail",
+	}, {
+		"complete HCLenTree, complete HLitTree with single code, empty HDistTree",
+		"05e001240000000000f8ffffffffffffffffffffffffffffffffffffffffffff" +
+			"ffffffffffffffffff07f00f",
+		"01",
+	}, {
+		"complete HCLenTree, complete HLitTree with multiple codes, empty HDistTree",
+		"05e301240000000000f8ffffffffffffffffffffffffffffffffffffffffffff" +
+			"ffffffffffffffffff07807f",
+		"01",
+	}, {
+		"complete HCLenTree, complete HLitTree, degenerate HDistTree, use valid HDist symbol",
+		"000100feff000de0010400000000100000000000000000000000000000000000" +
+			"0000000000000000000000000000003c",
+		"00000000",
+	}, {
+		"complete HCLenTree, degenerate HLitTree, degenerate HDistTree",
+		"05e0010400000000100000000000000000000000000000000000000000000000" +
+			"0000000000000000000c",
+		"",
+	}, {
+		"complete HCLenTree, degenerate HLitTree, empty HDistTree",
+		"05e0010400000000100000000000000000000000000000000000000000000000" +
+			"00000000000000000004",
+		"",
+	}, {
+		"complete HCLenTree, complete HLitTree, empty HDistTree, spanning repeater code",
+		"edfd870500000000200400000000000000000000000000000000000000000000" +
+			"000000000000000000e8b000",
+		"",
+	}, {
+		"complete HCLenTree with length codes, complete HLitTree, empty HDistTree",
+		"ede0010400000000100000000000000000000000000000000000000000000000" +
+			"0000000000000000000400004000",
+		"",
+	}, {
+		"complete HCLenTree, complete HLitTree, degenerate HDistTree, use valid HLit symbol 284 with count 31",
+		"000100feff00ede0010400000000100000000000000000000000000000000000" +
+			"000000000000000000000000000000040000407f00",
+		"0000000000000000000000000000000000000000000000000000000000000000" +
+			"0000000000000000000000000000000000000000000000000000000000000000" +
+			"0000000000000000000000000000000000000000000000000000000000000000" +
+			"0000000000000000000000000000000000000000000000000000000000000000" +
+			"0000000000000000000000000000000000000000000000000000000000000000" +
+			"0000000000000000000000000000000000000000000000000000000000000000" +
+			"0000000000000000000000000000000000000000000000000000000000000000" +
+			"0000000000000000000000000000000000000000000000000000000000000000" +
+			"000000",
+	}, {
+		"complete HCLenTree, complete HLitTree, degenerate HDistTree, use valid HLit and HDist symbols",
+		"0cc2010d00000082b0ac4aff0eb07d27060000ffff",
+		"616263616263",
+	}, {
+		"fixed block, use reserved symbol 287",
+		"33180700",
+		"fail",
+	}, {
+		"raw block",
+		"010100feff11",
+		"11",
+	}, {
+		"issue 10426 - over-subscribed HCLenTree causes a hang",
+		"344c4a4e494d4b070000ff2e2eff2e2e2e2e2eff",
+		"fail",
+	}, {
+		"issue 11030 - empty HDistTree unexpectedly leads to error",
+		"05c0070600000080400fff37a0ca",
+		"",
+	}, {
+		"issue 11033 - empty HDistTree unexpectedly leads to error",
+		"050fb109c020cca5d017dcbca044881ee1034ec149c8980bbc413c2ab35be9dc" +
+			"b1473449922449922411202306ee97b0383a521b4ffdcf3217f9f7d3adb701",
+		"3130303634342068652e706870005d05355f7ed957ff084a90925d19e3ebc6d0" +
+			"c6d7",
+	}}
+
+	for i, tc := range testCases {
+		data, err := hex.DecodeString(tc.stream)
+		if err != nil {
+			t.Fatal(err)
+		}
+		data, err = ioutil.ReadAll(NewReader(bytes.NewReader(data)))
+		if tc.want == "fail" {
+			if err == nil {
+				t.Errorf("#%d (%s): got nil error, want non-nil", i, tc.desc)
+			}
+		} else {
+			if err != nil {
+				t.Errorf("#%d (%s): %v", i, tc.desc, err)
+				continue
+			}
+			if got := hex.EncodeToString(data); got != tc.want {
+				t.Errorf("#%d (%s):\ngot  %q\nwant %q", i, tc.desc, got, tc.want)
+			}
+
+		}
+	}
+}
--- a/vendor/github.com/klauspost/compress/flate/gen.go
+++ b/vendor/github.com/klauspost/compress/flate/gen.go
@@ -0,0 +1,265 @@
+// Copyright 2012 The Go Authors.  All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+// +build ignore
+
+// This program generates fixedhuff.go
+// Invoke as
+//
+//	go run gen.go -output fixedhuff.go
+
+package main
+
+import (
+	"bytes"
+	"flag"
+	"fmt"
+	"go/format"
+	"io/ioutil"
+	"log"
+)
+
+var filename = flag.String("output", "fixedhuff.go", "output file name")
+
+const maxCodeLen = 16
+
+// Note: the definition of the huffmanDecoder struct is copied from
+// inflate.go, as it is private to the implementation.
+
+// chunk & 15 is number of bits
+// chunk >> 4 is value, including table link
+
+const (
+	huffmanChunkBits  = 9
+	huffmanNumChunks  = 1 << huffmanChunkBits
+	huffmanCountMask  = 15
+	huffmanValueShift = 4
+)
+
+type huffmanDecoder struct {
+	min      int                      // the minimum code length
+	chunks   [huffmanNumChunks]uint32 // chunks as described above
+	links    [][]uint32               // overflow links
+	linkMask uint32                   // mask the width of the link table
+}
+
+// Initialize Huffman decoding tables from array of code lengths.
+// Following this function, h is guaranteed to be initialized into a complete
+// tree (i.e., neither over-subscribed nor under-subscribed). The exception is a
+// degenerate case where the tree has only a single symbol with length 1. Empty
+// trees are permitted.
+func (h *huffmanDecoder) init(bits []int) bool {
+	// Sanity enables additional runtime tests during Huffman
+	// table construction.  It's intended to be used during
+	// development to supplement the currently ad-hoc unit tests.
+	const sanity = false
+
+	if h.min != 0 {
+		*h = huffmanDecoder{}
+	}
+
+	// Count number of codes of each length,
+	// compute min and max length.
+	var count [maxCodeLen]int
+	var min, max int
+	for _, n := range bits {
+		if n == 0 {
+			continue
+		}
+		if min == 0 || n < min {
+			min = n
+		}
+		if n > max {
+			max = n
+		}
+		count[n]++
+	}
+
+	// Empty tree. The decompressor.huffSym function will fail later if the tree
+	// is used. Technically, an empty tree is only valid for the HDIST tree and
+	// not the HCLEN and HLIT tree. However, a stream with an empty HCLEN tree
+	// is guaranteed to fail since it will attempt to use the tree to decode the
+	// codes for the HLIT and HDIST trees. Similarly, an empty HLIT tree is
+	// guaranteed to fail later since the compressed data section must be
+	// composed of at least one symbol (the end-of-block marker).
+	if max == 0 {
+		return true
+	}
+
+	code := 0
+	var nextcode [maxCodeLen]int
+	for i := min; i <= max; i++ {
+		code <<= 1
+		nextcode[i] = code
+		code += count[i]
+	}
+
+	// Check that the coding is complete (i.e., that we've
+	// assigned all 2-to-the-max possible bit sequences).
+	// Exception: To be compatible with zlib, we also need to
+	// accept degenerate single-code codings.  See also
+	// TestDegenerateHuffmanCoding.
+	if code != 1<<uint(max) && !(code == 1 && max == 1) {
+		return false
+	}
+
+	h.min = min
+	if max > huffmanChunkBits {
+		numLinks := 1 << (uint(max) - huffmanChunkBits)
+		h.linkMask = uint32(numLinks - 1)
+
+		// create link tables
+		link := nextcode[huffmanChunkBits+1] >> 1
+		h.links = make([][]uint32, huffmanNumChunks-link)
+		for j := uint(link); j < huffmanNumChunks; j++ {
+			reverse := int(reverseByte[j>>8]) | int(reverseByte[j&0xff])<<8
+			reverse >>= uint(16 - huffmanChunkBits)
+			off := j - uint(link)
+			if sanity && h.chunks[reverse] != 0 {
+				panic("impossible: overwriting existing chunk")
+			}
+			h.chunks[reverse] = uint32(off<<huffmanValueShift | (huffmanChunkBits + 1))
+			h.links[off] = make([]uint32, numLinks)
+		}
+	}
+
+	for i, n := range bits {
+		if n == 0 {
+			continue
+		}
+		code := nextcode[n]
+		nextcode[n]++
+		chunk := uint32(i<<huffmanValueShift | n)
+		reverse := int(reverseByte[code>>8]) | int(reverseByte[code&0xff])<<8
+		reverse >>= uint(16 - n)
+		if n <= huffmanChunkBits {
+			for off := reverse; off < len(h.chunks); off += 1 << uint(n) {
+				// We should never need to overwrite
+				// an existing chunk.  Also, 0 is
+				// never a valid chunk, because the
+				// lower 4 "count" bits should be
+				// between 1 and 15.
+				if sanity && h.chunks[off] != 0 {
+					panic("impossible: overwriting existing chunk")
+				}
+				h.chunks[off] = chunk
+			}
+		} else {
+			j := reverse & (huffmanNumChunks - 1)
+			if sanity && h.chunks[j]&huffmanCountMask != huffmanChunkBits+1 {
+				// Longer codes should have been
+				// associated with a link table above.
+				panic("impossible: not an indirect chunk")
+			}
+			value := h.chunks[j] >> huffmanValueShift
+			linktab := h.links[value]
+			reverse >>= huffmanChunkBits
+			for off := reverse; off < len(linktab); off += 1 << uint(n-huffmanChunkBits) {
+				if sanity && linktab[off] != 0 {
+					panic("impossible: overwriting existing chunk")
+				}
+				linktab[off] = chunk
+			}
+		}
+	}
+
+	if sanity {
+		// Above we've sanity checked that we never overwrote
+		// an existing entry.  Here we additionally check that
+		// we filled the tables completely.
+		for i, chunk := range h.chunks {
+			if chunk == 0 {
+				// As an exception, in the degenerate
+				// single-code case, we allow odd
+				// chunks to be missing.
+				if code == 1 && i%2 == 1 {
+					continue
+				}
+				panic("impossible: missing chunk")
+			}
+		}
+		for _, linktab := range h.links {
+			for _, chunk := range linktab {
+				if chunk == 0 {
+					panic("impossible: missing chunk")
+				}
+			}
+		}
+	}
+
+	return true
+}
+
+func main() {
+	flag.Parse()
+
+	var h huffmanDecoder
+	var bits [288]int
+	initReverseByte()
+	for i := 0; i < 144; i++ {
+		bits[i] = 8
+	}
+	for i := 144; i < 256; i++ {
+		bits[i] = 9
+	}
+	for i := 256; i < 280; i++ {
+		bits[i] = 7
+	}
+	for i := 280; i < 288; i++ {
+		bits[i] = 8
+	}
+	h.init(bits[:])
+	if h.links != nil {
+		log.Fatal("Unexpected links table in fixed Huffman decoder")
+	}
+
+	var buf bytes.Buffer
+
+	fmt.Fprintf(&buf, `// Copyright 2013 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.`+"\n\n")
+
+	fmt.Fprintln(&buf, "package flate")
+	fmt.Fprintln(&buf)
+	fmt.Fprintln(&buf, "// autogenerated by go run gen.go -output fixedhuff.go, DO NOT EDIT")
+	fmt.Fprintln(&buf)
+	fmt.Fprintln(&buf, "var fixedHuffmanDecoder = huffmanDecoder{")
+	fmt.Fprintf(&buf, "\t%d,\n", h.min)
+	fmt.Fprintln(&buf, "\t[huffmanNumChunks]uint32{")
+	for i := 0; i < huffmanNumChunks; i++ {
+		if i&7 == 0 {
+			fmt.Fprintf(&buf, "\t\t")
+		} else {
+			fmt.Fprintf(&buf, " ")
+		}
+		fmt.Fprintf(&buf, "0x%04x,", h.chunks[i])
+		if i&7 == 7 {
+			fmt.Fprintln(&buf)
+		}
+	}
+	fmt.Fprintln(&buf, "\t},")
+	fmt.Fprintln(&buf, "\tnil, 0,")
+	fmt.Fprintln(&buf, "}")
+
+	data, err := format.Source(buf.Bytes())
+	if err != nil {
+		log.Fatal(err)
+	}
+	err = ioutil.WriteFile(*filename, data, 0644)
+	if err != nil {
+		log.Fatal(err)
+	}
+}
+
+var reverseByte [256]byte
+
+func initReverseByte() {
+	for x := 0; x < 256; x++ {
+		var result byte
+		for i := uint(0); i < 8; i++ {
+			result |= byte(((x >> i) & 1) << (7 - i))
+		}
+		reverseByte[x] = result
+	}
+}
--- a/vendor/github.com/klauspost/compress/flate/huffman_bit_writer.go
+++ b/vendor/github.com/klauspost/compress/flate/huffman_bit_writer.go
@@ -0,0 +1,701 @@
+// Copyright 2009 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+package flate
+
+import (
+	"io"
+)
+
+const (
+	// The largest offset code.
+	offsetCodeCount = 30
+
+	// The special code used to mark the end of a block.
+	endBlockMarker = 256
+
+	// The first length code.
+	lengthCodesStart = 257
+
+	// The number of codegen codes.
+	codegenCodeCount = 19
+	badCode          = 255
+
+	// bufferFlushSize indicates the buffer size
+	// after which bytes are flushed to the writer.
+	// Should preferably be a multiple of 6, since
+	// we accumulate 6 bytes between writes to the buffer.
+	bufferFlushSize = 240
+
+	// bufferSize is the actual output byte buffer size.
+	// It must have additional headroom for a flush
+	// which can contain up to 8 bytes.
+	bufferSize = bufferFlushSize + 8
+)
+
+// The number of extra bits needed by length code X - LENGTH_CODES_START.
+var lengthExtraBits = []int8{
+	/* 257 */ 0, 0, 0,
+	/* 260 */ 0, 0, 0, 0, 0, 1, 1, 1, 1, 2,
+	/* 270 */ 2, 2, 2, 3, 3, 3, 3, 4, 4, 4,
+	/* 280 */ 4, 5, 5, 5, 5, 0,
+}
+
+// The length indicated by length code X - LENGTH_CODES_START.
+var lengthBase = []uint32{
+	0, 1, 2, 3, 4, 5, 6, 7, 8, 10,
+	12, 14, 16, 20, 24, 28, 32, 40, 48, 56,
+	64, 80, 96, 112, 128, 160, 192, 224, 255,
+}
+
+// offset code word extra bits.
+var offsetExtraBits = []int8{
+	0, 0, 0, 0, 1, 1, 2, 2, 3, 3,
+	4, 4, 5, 5, 6, 6, 7, 7, 8, 8,
+	9, 9, 10, 10, 11, 11, 12, 12, 13, 13,
+	/* extended window */
+	14, 14, 15, 15, 16, 16, 17, 17, 18, 18, 19, 19, 20, 20,
+}
+
+var offsetBase = []uint32{
+	/* normal deflate */
+	0x000000, 0x000001, 0x000002, 0x000003, 0x000004,
+	0x000006, 0x000008, 0x00000c, 0x000010, 0x000018,
+	0x000020, 0x000030, 0x000040, 0x000060, 0x000080,
+	0x0000c0, 0x000100, 0x000180, 0x000200, 0x000300,
+	0x000400, 0x000600, 0x000800, 0x000c00, 0x001000,
+	0x001800, 0x002000, 0x003000, 0x004000, 0x006000,
+
+	/* extended window */
+	0x008000, 0x00c000, 0x010000, 0x018000, 0x020000,
+	0x030000, 0x040000, 0x060000, 0x080000, 0x0c0000,
+	0x100000, 0x180000, 0x200000, 0x300000,
+}
+
+// The odd order in which the codegen code sizes are written.
+var codegenOrder = []uint32{16, 17, 18, 0, 8, 7, 9, 6, 10, 5, 11, 4, 12, 3, 13, 2, 14, 1, 15}
+
+type huffmanBitWriter struct {
+	// writer is the underlying writer.
+	// Do not use it directly; use the write method, which ensures
+	// that Write errors are sticky.
+	writer io.Writer
+
+	// Data waiting to be written is bytes[0:nbytes]
+	// and then the low nbits of bits.
+	bits            uint64
+	nbits           uint
+	bytes           [bufferSize]byte
+	codegenFreq     [codegenCodeCount]int32
+	nbytes          int
+	literalFreq     []int32
+	offsetFreq      []int32
+	codegen         []uint8
+	literalEncoding *huffmanEncoder
+	offsetEncoding  *huffmanEncoder
+	codegenEncoding *huffmanEncoder
+	err             error
+}
+
+func newHuffmanBitWriter(w io.Writer) *huffmanBitWriter {
+	return &huffmanBitWriter{
+		writer:          w,
+		literalFreq:     make([]int32, maxNumLit),
+		offsetFreq:      make([]int32, offsetCodeCount),
+		codegen:         make([]uint8, maxNumLit+offsetCodeCount+1),
+		literalEncoding: newHuffmanEncoder(maxNumLit),
+		codegenEncoding: newHuffmanEncoder(codegenCodeCount),
+		offsetEncoding:  newHuffmanEncoder(offsetCodeCount),
+	}
+}
+
+func (w *huffmanBitWriter) reset(writer io.Writer) {
+	w.writer = writer
+	w.bits, w.nbits, w.nbytes, w.err = 0, 0, 0, nil
+	w.bytes = [bufferSize]byte{}
+}
+
+func (w *huffmanBitWriter) flush() {
+	if w.err != nil {
+		w.nbits = 0
+		return
+	}
+	n := w.nbytes
+	for w.nbits != 0 {
+		w.bytes[n] = byte(w.bits)
+		w.bits >>= 8
+		if w.nbits > 8 { // Avoid underflow
+			w.nbits -= 8
+		} else {
+			w.nbits = 0
+		}
+		n++
+	}
+	w.bits = 0
+	w.write(w.bytes[:n])
+	w.nbytes = 0
+}
+
+func (w *huffmanBitWriter) write(b []byte) {
+	if w.err != nil {
+		return
+	}
+	_, w.err = w.writer.Write(b)
+}
+
+func (w *huffmanBitWriter) writeBits(b int32, nb uint) {
+	if w.err != nil {
+		return
+	}
+	w.bits |= uint64(b) << w.nbits
+	w.nbits += nb
+	if w.nbits >= 48 {
+		bits := w.bits
+		w.bits >>= 48
+		w.nbits -= 48
+		n := w.nbytes
+		bytes := w.bytes[n : n+6]
+		bytes[0] = byte(bits)
+		bytes[1] = byte(bits >> 8)
+		bytes[2] = byte(bits >> 16)
+		bytes[3] = byte(bits >> 24)
+		bytes[4] = byte(bits >> 32)
+		bytes[5] = byte(bits >> 40)
+		n += 6
+		if n >= bufferFlushSize {
+			w.write(w.bytes[:n])
+			n = 0
+		}
+		w.nbytes = n
+	}
+}
+
+func (w *huffmanBitWriter) writeBytes(bytes []byte) {
+	if w.err != nil {
+		return
+	}
+	n := w.nbytes
+	if w.nbits&7 != 0 {
+		w.err = InternalError("writeBytes with unfinished bits")
+		return
+	}
+	for w.nbits != 0 {
+		w.bytes[n] = byte(w.bits)
+		w.bits >>= 8
+		w.nbits -= 8
+		n++
+	}
+	if n != 0 {
+		w.write(w.bytes[:n])
+	}
+	w.nbytes = 0
+	w.write(bytes)
+}
+
+// RFC 1951 3.2.7 specifies a special run-length encoding for specifying
+// the literal and offset lengths arrays (which are concatenated into a single
+// array).  This method generates that run-length encoding.
+//
+// The result is written into the codegen array, and the frequencies
+// of each code is written into the codegenFreq array.
+// Codes 0-15 are single byte codes. Codes 16-18 are followed by additional
+// information. Code badCode is an end marker
+//
+//  numLiterals      The number of literals in literalEncoding
+//  numOffsets       The number of offsets in offsetEncoding
+//  litenc, offenc   The literal and offset encoder to use
+func (w *huffmanBitWriter) generateCodegen(numLiterals int, numOffsets int, litEnc, offEnc *huffmanEncoder) {
+	for i := range w.codegenFreq {
+		w.codegenFreq[i] = 0
+	}
+	// Note that we are using codegen both as a temporary variable for holding
+	// a copy of the frequencies, and as the place where we put the result.
+	// This is fine because the output is always shorter than the input used
+	// so far.
+	codegen := w.codegen // cache
+	// Copy the concatenated code sizes to codegen. Put a marker at the end.
+	cgnl := codegen[:numLiterals]
+	for i := range cgnl {
+		cgnl[i] = uint8(litEnc.codes[i].len)
+	}
+
+	cgnl = codegen[numLiterals : numLiterals+numOffsets]
+	for i := range cgnl {
+		cgnl[i] = uint8(offEnc.codes[i].len)
+	}
+	codegen[numLiterals+numOffsets] = badCode
+
+	size := codegen[0]
+	count := 1
+	outIndex := 0
+	for inIndex := 1; size != badCode; inIndex++ {
+		// INVARIANT: We have seen "count" copies of size that have not yet
+		// had output generated for them.
+		nextSize := codegen[inIndex]
+		if nextSize == size {
+			count++
+			continue
+		}
+		// We need to generate codegen indicating "count" of size.
+		if size != 0 {
+			codegen[outIndex] = size
+			outIndex++
+			w.codegenFreq[size]++
+			count--
+			for count >= 3 {
+				n := 6
+				if n > count {
+					n = count
+				}
+				codegen[outIndex] = 16
+				outIndex++
+				codegen[outIndex] = uint8(n - 3)
+				outIndex++
+				w.codegenFreq[16]++
+				count -= n
+			}
+		} else {
+			for count >= 11 {
+				n := 138
+				if n > count {
+					n = count
+				}
+				codegen[outIndex] = 18
+				outIndex++
+				codegen[outIndex] = uint8(n - 11)
+				outIndex++
+				w.codegenFreq[18]++
+				count -= n
+			}
+			if count >= 3 {
+				// count >= 3 && count <= 10
+				codegen[outIndex] = 17
+				outIndex++
+				codegen[outIndex] = uint8(count - 3)
+				outIndex++
+				w.codegenFreq[17]++
+				count = 0
+			}
+		}
+		count--
+		for ; count >= 0; count-- {
+			codegen[outIndex] = size
+			outIndex++
+			w.codegenFreq[size]++
+		}
+		// Set up invariant for next time through the loop.
+		size = nextSize
+		count = 1
+	}
+	// Marker indicating the end of the codegen.
+	codegen[outIndex] = badCode
+}
+
+// dynamicSize returns the size of dynamically encoded data in bits.
+func (w *huffmanBitWriter) dynamicSize(litEnc, offEnc *huffmanEncoder, extraBits int) (size, numCodegens int) {
+	numCodegens = len(w.codegenFreq)
+	for numCodegens > 4 && w.codegenFreq[codegenOrder[numCodegens-1]] == 0 {
+		numCodegens--
+	}
+	header := 3 + 5 + 5 + 4 + (3 * numCodegens) +
+		w.codegenEncoding.bitLength(w.codegenFreq[:]) +
+		int(w.codegenFreq[16])*2 +
+		int(w.codegenFreq[17])*3 +
+		int(w.codegenFreq[18])*7
+	size = header +
+		litEnc.bitLength(w.literalFreq) +
+		offEnc.bitLength(w.offsetFreq) +
+		extraBits
+
+	return size, numCodegens
+}
+
+// fixedSize returns the size of dynamically encoded data in bits.
+func (w *huffmanBitWriter) fixedSize(extraBits int) int {
+	return 3 +
+		fixedLiteralEncoding.bitLength(w.literalFreq) +
+		fixedOffsetEncoding.bitLength(w.offsetFreq) +
+		extraBits
+}
+
+// storedSize calculates the stored size, including header.
+// The function returns the size in bits and whether the block
+// fits inside a single block.
+func (w *huffmanBitWriter) storedSize(in []byte) (int, bool) {
+	if in == nil {
+		return 0, false
+	}
+	if len(in) <= maxStoreBlockSize {
+		return (len(in) + 5) * 8, true
+	}
+	return 0, false
+}
+
+func (w *huffmanBitWriter) writeCode(c hcode) {
+	if w.err != nil {
+		return
+	}
+	w.bits |= uint64(c.code) << w.nbits
+	w.nbits += uint(c.len)
+	if w.nbits >= 48 {
+		bits := w.bits
+		w.bits >>= 48
+		w.nbits -= 48
+		n := w.nbytes
+		bytes := w.bytes[n : n+6]
+		bytes[0] = byte(bits)
+		bytes[1] = byte(bits >> 8)
+		bytes[2] = byte(bits >> 16)
+		bytes[3] = byte(bits >> 24)
+		bytes[4] = byte(bits >> 32)
+		bytes[5] = byte(bits >> 40)
+		n += 6
+		if n >= bufferFlushSize {
+			w.write(w.bytes[:n])
+			n = 0
+		}
+		w.nbytes = n
+	}
+}
+
+// Write the header of a dynamic Huffman block to the output stream.
+//
+//  numLiterals  The number of literals specified in codegen
+//  numOffsets   The number of offsets specified in codegen
+//  numCodegens  The number of codegens used in codegen
+func (w *huffmanBitWriter) writeDynamicHeader(numLiterals int, numOffsets int, numCodegens int, isEof bool) {
+	if w.err != nil {
+		return
+	}
+	var firstBits int32 = 4
+	if isEof {
+		firstBits = 5
+	}
+	w.writeBits(firstBits, 3)
+	w.writeBits(int32(numLiterals-257), 5)
+	w.writeBits(int32(numOffsets-1), 5)
+	w.writeBits(int32(numCodegens-4), 4)
+
+	for i := 0; i < numCodegens; i++ {
+		value := uint(w.codegenEncoding.codes[codegenOrder[i]].len)
+		w.writeBits(int32(value), 3)
+	}
+
+	i := 0
+	for {
+		var codeWord int = int(w.codegen[i])
+		i++
+		if codeWord == badCode {
+			break
+		}
+		w.writeCode(w.codegenEncoding.codes[uint32(codeWord)])
+
+		switch codeWord {
+		case 16:
+			w.writeBits(int32(w.codegen[i]), 2)
+			i++
+			break
+		case 17:
+			w.writeBits(int32(w.codegen[i]), 3)
+			i++
+			break
+		case 18:
+			w.writeBits(int32(w.codegen[i]), 7)
+			i++
+			break
+		}
+	}
+}
+
+func (w *huffmanBitWriter) writeStoredHeader(length int, isEof bool) {
+	if w.err != nil {
+		return
+	}
+	var flag int32
+	if isEof {
+		flag = 1
+	}
+	w.writeBits(flag, 3)
+	w.flush()
+	w.writeBits(int32(length), 16)
+	w.writeBits(int32(^uint16(length)), 16)
+}
+
+func (w *huffmanBitWriter) writeFixedHeader(isEof bool) {
+	if w.err != nil {
+		return
+	}
+	// Indicate that we are a fixed Huffman block
+	var value int32 = 2
+	if isEof {
+		value = 3
+	}
+	w.writeBits(value, 3)
+}
+
+// writeBlock will write a block of tokens with the smallest encoding.
+// The original input can be supplied, and if the huffman encoded data
+// is larger than the original bytes, the data will be written as a
+// stored block.
+// If the input is nil, the tokens will always be Huffman encoded.
+func (w *huffmanBitWriter) writeBlock(tokens []token, eof bool, input []byte) {
+	if w.err != nil {
+		return
+	}
+
+	tokens = append(tokens, endBlockMarker)
+	numLiterals, numOffsets := w.indexTokens(tokens)
+
+	var extraBits int
+	storedSize, storable := w.storedSize(input)
+	if storable {
+		// We only bother calculating the costs of the extra bits required by
+		// the length of offset fields (which will be the same for both fixed
+		// and dynamic encoding), if we need to compare those two encodings
+		// against stored encoding.
+		for lengthCode := lengthCodesStart + 8; lengthCode < numLiterals; lengthCode++ {
+			// First eight length codes have extra size = 0.
+			extraBits += int(w.literalFreq[lengthCode]) * int(lengthExtraBits[lengthCode-lengthCodesStart])
+		}
+		for offsetCode := 4; offsetCode < numOffsets; offsetCode++ {
+			// First four offset codes have extra size = 0.
+			extraBits += int(w.offsetFreq[offsetCode]) * int(offsetExtraBits[offsetCode])
+		}
+	}
+
+	// Figure out smallest code.
+	// Fixed Huffman baseline.
+	var literalEncoding = fixedLiteralEncoding
+	var offsetEncoding = fixedOffsetEncoding
+	var size = w.fixedSize(extraBits)
+
+	// Dynamic Huffman?
+	var numCodegens int
+
+	// Generate codegen and codegenFrequencies, which indicates how to encode
+	// the literalEncoding and the offsetEncoding.
+	w.generateCodegen(numLiterals, numOffsets, w.literalEncoding, w.offsetEncoding)
+	w.codegenEncoding.generate(w.codegenFreq[:], 7)
+	dynamicSize, numCodegens := w.dynamicSize(w.literalEncoding, w.offsetEncoding, extraBits)
+
+	if dynamicSize < size {
+		size = dynamicSize
+		literalEncoding = w.literalEncoding
+		offsetEncoding = w.offsetEncoding
+	}
+
+	// Stored bytes?
+	if storable && storedSize < size {
+		w.writeStoredHeader(len(input), eof)
+		w.writeBytes(input)
+		return
+	}
+
+	// Huffman.
+	if literalEncoding == fixedLiteralEncoding {
+		w.writeFixedHeader(eof)
+	} else {
+		w.writeDynamicHeader(numLiterals, numOffsets, numCodegens, eof)
+	}
+
+	// Write the tokens.
+	w.writeTokens(tokens, literalEncoding.codes, offsetEncoding.codes)
+}
+
+// writeBlockDynamic encodes a block using a dynamic Huffman table.
+// This should be used if the symbols used have a disproportionate
+// histogram distribution.
+// If input is supplied and the compression savings are below 1/16th of the
+// input size the block is stored.
+func (w *huffmanBitWriter) writeBlockDynamic(tokens []token, eof bool, input []byte) {
+	if w.err != nil {
+		return
+	}
+
+	tokens = append(tokens, endBlockMarker)
+	numLiterals, numOffsets := w.indexTokens(tokens)
+
+	// Generate codegen and codegenFrequencies, which indicates how to encode
+	// the literalEncoding and the offsetEncoding.
+	w.generateCodegen(numLiterals, numOffsets, w.literalEncoding, w.offsetEncoding)
+	w.codegenEncoding.generate(w.codegenFreq[:], 7)
+	size, numCodegens := w.dynamicSize(w.literalEncoding, w.offsetEncoding, 0)
+
+	// Store bytes, if we don't get a reasonable improvement.
+	if ssize, storable := w.storedSize(input); storable && ssize < (size+size>>4) {
+		w.writeStoredHeader(len(input), eof)
+		w.writeBytes(input)
+		return
+	}
+
+	// Write Huffman table.
+	w.writeDynamicHeader(numLiterals, numOffsets, numCodegens, eof)
+
+	// Write the tokens.
+	w.writeTokens(tokens, w.literalEncoding.codes, w.offsetEncoding.codes)
+}
+
+// indexTokens indexes a slice of tokens, and updates
+// literalFreq and offsetFreq, and generates literalEncoding
+// and offsetEncoding.
+// The number of literal and offset tokens is returned.
+func (w *huffmanBitWriter) indexTokens(tokens []token) (numLiterals, numOffsets int) {
+	for i := range w.literalFreq {
+		w.literalFreq[i] = 0
+	}
+	for i := range w.offsetFreq {
+		w.offsetFreq[i] = 0
+	}
+
+	for _, t := range tokens {
+		if t < matchType {
+			w.literalFreq[t.literal()]++
+			continue
+		}
+		length := t.length()
+		offset := t.offset()
+		w.literalFreq[lengthCodesStart+lengthCode(length)]++
+		w.offsetFreq[offsetCode(offset)]++
+	}
+
+	// get the number of literals
+	numLiterals = len(w.literalFreq)
+	for w.literalFreq[numLiterals-1] == 0 {
+		numLiterals--
+	}
+	// get the number of offsets
+	numOffsets = len(w.offsetFreq)
+	for numOffsets > 0 && w.offsetFreq[numOffsets-1] == 0 {
+		numOffsets--
+	}
+	if numOffsets == 0 {
+		// We haven't found a single match. If we want to go with the dynamic encoding,
+		// we should count at least one offset to be sure that the offset huffman tree could be encoded.
+		w.offsetFreq[0] = 1
+		numOffsets = 1
+	}
+	w.literalEncoding.generate(w.literalFreq, 15)
+	w.offsetEncoding.generate(w.offsetFreq, 15)
+	return
+}
+
+// writeTokens writes a slice of tokens to the output.
+// codes for literal and offset encoding must be supplied.
+func (w *huffmanBitWriter) writeTokens(tokens []token, leCodes, oeCodes []hcode) {
+	if w.err != nil {
+		return
+	}
+	for _, t := range tokens {
+		if t < matchType {
+			w.writeCode(leCodes[t.literal()])
+			continue
+		}
+		// Write the length
+		length := t.length()
+		lengthCode := lengthCode(length)
+		w.writeCode(leCodes[lengthCode+lengthCodesStart])
+		extraLengthBits := uint(lengthExtraBits[lengthCode])
+		if extraLengthBits > 0 {
+			extraLength := int32(length - lengthBase[lengthCode])
+			w.writeBits(extraLength, extraLengthBits)
+		}
+		// Write the offset
+		offset := t.offset()
+		offsetCode := offsetCode(offset)
+		w.writeCode(oeCodes[offsetCode])
+		extraOffsetBits := uint(offsetExtraBits[offsetCode])
+		if extraOffsetBits > 0 {
+			extraOffset := int32(offset - offsetBase[offsetCode])
+			w.writeBits(extraOffset, extraOffsetBits)
+		}
+	}
+}
+
+// huffOffset is a static offset encoder used for huffman only encoding.
+// It can be reused since we will not be encoding offset values.
+var huffOffset *huffmanEncoder
+
+func init() {
+	w := newHuffmanBitWriter(nil)
+	w.offsetFreq[0] = 1
+	huffOffset = newHuffmanEncoder(offsetCodeCount)
+	huffOffset.generate(w.offsetFreq, 15)
+}
+
+// writeBlockHuff encodes a block of bytes as either
+// Huffman encoded literals or uncompressed bytes if the
+// results only gains very little from compression.
+func (w *huffmanBitWriter) writeBlockHuff(eof bool, input []byte) {
+	if w.err != nil {
+		return
+	}
+
+	// Clear histogram
+	for i := range w.literalFreq {
+		w.literalFreq[i] = 0
+	}
+
+	// Add everything as literals
+	histogram(input, w.literalFreq)
+
+	w.literalFreq[endBlockMarker] = 1
+
+	const numLiterals = endBlockMarker + 1
+	const numOffsets = 1
+
+	w.literalEncoding.generate(w.literalFreq, 15)
+
+	// Figure out smallest code.
+	// Always use dynamic Huffman or Store
+	var numCodegens int
+
+	// Generate codegen and codegenFrequencies, which indicates how to encode
+	// the literalEncoding and the offsetEncoding.
+	w.generateCodegen(numLiterals, numOffsets, w.literalEncoding, huffOffset)
+	w.codegenEncoding.generate(w.codegenFreq[:], 7)
+	size, numCodegens := w.dynamicSize(w.literalEncoding, huffOffset, 0)
+
+	// Store bytes, if we don't get a reasonable improvement.
+	if ssize, storable := w.storedSize(input); storable && ssize < (size+size>>4) {
+		w.writeStoredHeader(len(input), eof)
+		w.writeBytes(input)
+		return
+	}
+
+	// Huffman.
+	w.writeDynamicHeader(numLiterals, numOffsets, numCodegens, eof)
+	encoding := w.literalEncoding.codes[:257]
+	n := w.nbytes
+	for _, t := range input {
+		// Bitwriting inlined, ~30% speedup
+		c := encoding[t]
+		w.bits |= uint64(c.code) << w.nbits
+		w.nbits += uint(c.len)
+		if w.nbits < 48 {
+			continue
+		}
+		// Store 6 bytes
+		bits := w.bits
+		w.bits >>= 48
+		w.nbits -= 48
+		bytes := w.bytes[n : n+6]
+		bytes[0] = byte(bits)
+		bytes[1] = byte(bits >> 8)
+		bytes[2] = byte(bits >> 16)
+		bytes[3] = byte(bits >> 24)
+		bytes[4] = byte(bits >> 32)
+		bytes[5] = byte(bits >> 40)
+		n += 6
+		if n < bufferFlushSize {
+			continue
+		}
+		w.write(w.bytes[:n])
+		if w.err != nil {
+			return // Return early in the event of write failures
+		}
+		n = 0
+	}
+	w.nbytes = n
+	w.writeCode(encoding[endBlockMarker])
+}
--- a/vendor/github.com/klauspost/compress/flate/huffman_bit_writer_test.go
+++ b/vendor/github.com/klauspost/compress/flate/huffman_bit_writer_test.go
--- a/vendor/github.com/klauspost/compress/flate/huffman_code.go
+++ b/vendor/github.com/klauspost/compress/flate/huffman_code.go
@@ -0,0 +1,344 @@
+// Copyright 2009 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+package flate
+
+import (
+	"math"
+	"sort"
+)
+
+// hcode is a huffman code with a bit code and bit length.
+type hcode struct {
+	code, len uint16
+}
+
+type huffmanEncoder struct {
+	codes     []hcode
+	freqcache []literalNode
+	bitCount  [17]int32
+	lns       byLiteral // stored to avoid repeated allocation in generate
+	lfs       byFreq    // stored to avoid repeated allocation in generate
+}
+
+type literalNode struct {
+	literal uint16
+	freq    int32
+}
+
+// A levelInfo describes the state of the constructed tree for a given depth.
+type levelInfo struct {
+	// Our level.  for better printing
+	level int32
+
+	// The frequency of the last node at this level
+	lastFreq int32
+
+	// The frequency of the next character to add to this level
+	nextCharFreq int32
+
+	// The frequency of the next pair (from level below) to add to this level.
+	// Only valid if the "needed" value of the next lower level is 0.
+	nextPairFreq int32
+
+	// The number of chains remaining to generate for this level before moving
+	// up to the next level
+	needed int32
+}
+
+// set sets the code and length of an hcode.
+func (h *hcode) set(code uint16, length uint16) {
+	h.len = length
+	h.code = code
+}
+
+func maxNode() literalNode { return literalNode{math.MaxUint16, math.MaxInt32} }
+
+func newHuffmanEncoder(size int) *huffmanEncoder {
+	return &huffmanEncoder{codes: make([]hcode, size)}
+}
+
+// Generates a HuffmanCode corresponding to the fixed literal table
+func generateFixedLiteralEncoding() *huffmanEncoder {
+	h := newHuffmanEncoder(maxNumLit)
+	codes := h.codes
+	var ch uint16
+	for ch = 0; ch < maxNumLit; ch++ {
+		var bits uint16
+		var size uint16
+		switch {
+		case ch < 144:
+			// size 8, 000110000  .. 10111111
+			bits = ch + 48
+			size = 8
+			break
+		case ch < 256:
+			// size 9, 110010000 .. 111111111
+			bits = ch + 400 - 144
+			size = 9
+			break
+		case ch < 280:
+			// size 7, 0000000 .. 0010111
+			bits = ch - 256
+			size = 7
+			break
+		default:
+			// size 8, 11000000 .. 11000111
+			bits = ch + 192 - 280
+			size = 8
+		}
+		codes[ch] = hcode{code: reverseBits(bits, byte(size)), len: size}
+	}
+	return h
+}
+
+func generateFixedOffsetEncoding() *huffmanEncoder {
+	h := newHuffmanEncoder(30)
+	codes := h.codes
+	for ch := range codes {
+		codes[ch] = hcode{code: reverseBits(uint16(ch), 5), len: 5}
+	}
+	return h
+}
+
+var fixedLiteralEncoding *huffmanEncoder = generateFixedLiteralEncoding()
+var fixedOffsetEncoding *huffmanEncoder = generateFixedOffsetEncoding()
+
+func (h *huffmanEncoder) bitLength(freq []int32) int {
+	var total int
+	for i, f := range freq {
+		if f != 0 {
+			total += int(f) * int(h.codes[i].len)
+		}
+	}
+	return total
+}
+
+const maxBitsLimit = 16
+
+// Return the number of literals assigned to each bit size in the Huffman encoding
+//
+// This method is only called when list.length >= 3
+// The cases of 0, 1, and 2 literals are handled by special case code.
+//
+// list  An array of the literals with non-zero frequencies
+//             and their associated frequencies. The array is in order of increasing
+//             frequency, and has as its last element a special element with frequency
+//             MaxInt32
+// maxBits     The maximum number of bits that should be used to encode any literal.
+//             Must be less than 16.
+// return      An integer array in which array[i] indicates the number of literals
+//             that should be encoded in i bits.
+func (h *huffmanEncoder) bitCounts(list []literalNode, maxBits int32) []int32 {
+	if maxBits >= maxBitsLimit {
+		panic("flate: maxBits too large")
+	}
+	n := int32(len(list))
+	list = list[0 : n+1]
+	list[n] = maxNode()
+
+	// The tree can't have greater depth than n - 1, no matter what. This
+	// saves a little bit of work in some small cases
+	if maxBits > n-1 {
+		maxBits = n - 1
+	}
+
+	// Create information about each of the levels.
+	// A bogus "Level 0" whose sole purpose is so that
+	// level1.prev.needed==0.  This makes level1.nextPairFreq
+	// be a legitimate value that never gets chosen.
+	var levels [maxBitsLimit]levelInfo
+	// leafCounts[i] counts the number of literals at the left
+	// of ancestors of the rightmost node at level i.
+	// leafCounts[i][j] is the number of literals at the left
+	// of the level j ancestor.
+	var leafCounts [maxBitsLimit][maxBitsLimit]int32
+
+	for level := int32(1); level <= maxBits; level++ {
+		// For every level, the first two items are the first two characters.
+		// We initialize the levels as if we had already figured this out.
+		levels[level] = levelInfo{
+			level:        level,
+			lastFreq:     list[1].freq,
+			nextCharFreq: list[2].freq,
+			nextPairFreq: list[0].freq + list[1].freq,
+		}
+		leafCounts[level][level] = 2
+		if level == 1 {
+			levels[level].nextPairFreq = math.MaxInt32
+		}
+	}
+
+	// We need a total of 2*n - 2 items at top level and have already generated 2.
+	levels[maxBits].needed = 2*n - 4
+
+	level := maxBits
+	for {
+		l := &levels[level]
+		if l.nextPairFreq == math.MaxInt32 && l.nextCharFreq == math.MaxInt32 {
+			// We've run out of both leafs and pairs.
+			// End all calculations for this level.
+			// To make sure we never come back to this level or any lower level,
+			// set nextPairFreq impossibly large.
+			l.needed = 0
+			levels[level+1].nextPairFreq = math.MaxInt32
+			level++
+			continue
+		}
+
+		prevFreq := l.lastFreq
+		if l.nextCharFreq < l.nextPairFreq {
+			// The next item on this row is a leaf node.
+			n := leafCounts[level][level] + 1
+			l.lastFreq = l.nextCharFreq
+			// Lower leafCounts are the same of the previous node.
+			leafCounts[level][level] = n
+			l.nextCharFreq = list[n].freq
+		} else {
+			// The next item on this row is a pair from the previous row.
+			// nextPairFreq isn't valid until we generate two
+			// more values in the level below
+			l.lastFreq = l.nextPairFreq
+			// Take leaf counts from the lower level, except counts[level] remains the same.
+			copy(leafCounts[level][:level], leafCounts[level-1][:level])
+			levels[l.level-1].needed = 2
+		}
+
+		if l.needed--; l.needed == 0 {
+			// We've done everything we need to do for this level.
+			// Continue calculating one level up. Fill in nextPairFreq
+			// of that level with the sum of the two nodes we've just calculated on
+			// this level.
+			if l.level == maxBits {
+				// All done!
+				break
+			}
+			levels[l.level+1].nextPairFreq = prevFreq + l.lastFreq
+			level++
+		} else {
+			// If we stole from below, move down temporarily to replenish it.
+			for levels[level-1].needed > 0 {
+				level--
+			}
+		}
+	}
+
+	// Somethings is wrong if at the end, the top level is null or hasn't used
+	// all of the leaves.
+	if leafCounts[maxBits][maxBits] != n {
+		panic("leafCounts[maxBits][maxBits] != n")
+	}
+
+	bitCount := h.bitCount[:maxBits+1]
+	bits := 1
+	counts := &leafCounts[maxBits]
+	for level := maxBits; level > 0; level-- {
+		// chain.leafCount gives the number of literals requiring at least "bits"
+		// bits to encode.
+		bitCount[bits] = counts[level] - counts[level-1]
+		bits++
+	}
+	return bitCount
+}
+
+// Look at the leaves and assign them a bit count and an encoding as specified
+// in RFC 1951 3.2.2
+func (h *huffmanEncoder) assignEncodingAndSize(bitCount []int32, list []literalNode) {
+	code := uint16(0)
+	for n, bits := range bitCount {
+		code <<= 1
+		if n == 0 || bits == 0 {
+			continue
+		}
+		// The literals list[len(list)-bits] .. list[len(list)-bits]
+		// are encoded using "bits" bits, and get the values
+		// code, code + 1, ....  The code values are
+		// assigned in literal order (not frequency order).
+		chunk := list[len(list)-int(bits):]
+
+		h.lns.sort(chunk)
+		for _, node := range chunk {
+			h.codes[node.literal] = hcode{code: reverseBits(code, uint8(n)), len: uint16(n)}
+			code++
+		}
+		list = list[0 : len(list)-int(bits)]
+	}
+}
+
+// Update this Huffman Code object to be the minimum code for the specified frequency count.
+//
+// freq  An array of frequencies, in which frequency[i] gives the frequency of literal i.
+// maxBits  The maximum number of bits to use for any literal.
+func (h *huffmanEncoder) generate(freq []int32, maxBits int32) {
+	if h.freqcache == nil {
+		// Allocate a reusable buffer with the longest possible frequency table.
+		// Possible lengths are codegenCodeCount, offsetCodeCount and maxNumLit.
+		// The largest of these is maxNumLit, so we allocate for that case.
+		h.freqcache = make([]literalNode, maxNumLit+1)
+	}
+	list := h.freqcache[:len(freq)+1]
+	// Number of non-zero literals
+	count := 0
+	// Set list to be the set of all non-zero literals and their frequencies
+	for i, f := range freq {
+		if f != 0 {
+			list[count] = literalNode{uint16(i), f}
+			count++
+		} else {
+			list[count] = literalNode{}
+			h.codes[i].len = 0
+		}
+	}
+	list[len(freq)] = literalNode{}
+
+	list = list[:count]
+	if count <= 2 {
+		// Handle the small cases here, because they are awkward for the general case code. With
+		// two or fewer literals, everything has bit length 1.
+		for i, node := range list {
+			// "list" is in order of increasing literal value.
+			h.codes[node.literal].set(uint16(i), 1)
+		}
+		return
+	}
+	h.lfs.sort(list)
+
+	// Get the number of literals for each bit count
+	bitCount := h.bitCounts(list, maxBits)
+	// And do the assignment
+	h.assignEncodingAndSize(bitCount, list)
+}
+
+type byLiteral []literalNode
+
+func (s *byLiteral) sort(a []literalNode) {
+	*s = byLiteral(a)
+	sort.Sort(s)
+}
+
+func (s byLiteral) Len() int { return len(s) }
+
+func (s byLiteral) Less(i, j int) bool {
+	return s[i].literal < s[j].literal
+}
+
+func (s byLiteral) Swap(i, j int) { s[i], s[j] = s[j], s[i] }
+
+type byFreq []literalNode
+
+func (s *byFreq) sort(a []literalNode) {
+	*s = byFreq(a)
+	sort.Sort(s)
+}
+
+func (s byFreq) Len() int { return len(s) }
+
+func (s byFreq) Less(i, j int) bool {
+	if s[i].freq == s[j].freq {
+		return s[i].literal < s[j].literal
+	}
+	return s[i].freq < s[j].freq
+}
+
+func (s byFreq) Swap(i, j int) { s[i], s[j] = s[j], s[i] }
--- a/vendor/github.com/klauspost/compress/flate/inflate.go
+++ b/vendor/github.com/klauspost/compress/flate/inflate.go
@@ -0,0 +1,846 @@
+// Copyright 2009 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+// Package flate implements the DEFLATE compressed data format, described in
+// RFC 1951.  The gzip and zlib packages implement access to DEFLATE-based file
+// formats.
+package flate
+
+import (
+	"bufio"
+	"io"
+	"strconv"
+	"sync"
+)
+
+const (
+	maxCodeLen = 16 // max length of Huffman code
+	// The next three numbers come from the RFC section 3.2.7, with the
+	// additional proviso in section 3.2.5 which implies that distance codes
+	// 30 and 31 should never occur in compressed data.
+	maxNumLit  = 286
+	maxNumDist = 30
+	numCodes   = 19 // number of codes in Huffman meta-code
+)
+
+// Initialize the fixedHuffmanDecoder only once upon first use.
+var fixedOnce sync.Once
+var fixedHuffmanDecoder huffmanDecoder
+
+// A CorruptInputError reports the presence of corrupt input at a given offset.
+type CorruptInputError int64
+
+func (e CorruptInputError) Error() string {
+	return "flate: corrupt input before offset " + strconv.FormatInt(int64(e), 10)
+}
+
+// An InternalError reports an error in the flate code itself.
+type InternalError string
+
+func (e InternalError) Error() string { return "flate: internal error: " + string(e) }
+
+// A ReadError reports an error encountered while reading input.
+//
+// Deprecated: No longer returned.
+type ReadError struct {
+	Offset int64 // byte offset where error occurred
+	Err    error // error returned by underlying Read
+}
+
+func (e *ReadError) Error() string {
+	return "flate: read error at offset " + strconv.FormatInt(e.Offset, 10) + ": " + e.Err.Error()
+}
+
+// A WriteError reports an error encountered while writing output.
+//
+// Deprecated: No longer returned.
+type WriteError struct {
+	Offset int64 // byte offset where error occurred
+	Err    error // error returned by underlying Write
+}
+
+func (e *WriteError) Error() string {
+	return "flate: write error at offset " + strconv.FormatInt(e.Offset, 10) + ": " + e.Err.Error()
+}
+
+// Resetter resets a ReadCloser returned by NewReader or NewReaderDict to
+// to switch to a new underlying Reader. This permits reusing a ReadCloser
+// instead of allocating a new one.
+type Resetter interface {
+	// Reset discards any buffered data and resets the Resetter as if it was
+	// newly initialized with the given reader.
+	Reset(r io.Reader, dict []byte) error
+}
+
+// The data structure for decoding Huffman tables is based on that of
+// zlib. There is a lookup table of a fixed bit width (huffmanChunkBits),
+// For codes smaller than the table width, there are multiple entries
+// (each combination of trailing bits has the same value). For codes
+// larger than the table width, the table contains a link to an overflow
+// table. The width of each entry in the link table is the maximum code
+// size minus the chunk width.
+//
+// Note that you can do a lookup in the table even without all bits
+// filled. Since the extra bits are zero, and the DEFLATE Huffman codes
+// have the property that shorter codes come before longer ones, the
+// bit length estimate in the result is a lower bound on the actual
+// number of bits.
+//
+// See the following:
+//	http://www.gzip.org/algorithm.txt
+
+// chunk & 15 is number of bits
+// chunk >> 4 is value, including table link
+
+const (
+	huffmanChunkBits  = 9
+	huffmanNumChunks  = 1 << huffmanChunkBits
+	huffmanCountMask  = 15
+	huffmanValueShift = 4
+)
+
+type huffmanDecoder struct {
+	min      int                      // the minimum code length
+	chunks   [huffmanNumChunks]uint32 // chunks as described above
+	links    [][]uint32               // overflow links
+	linkMask uint32                   // mask the width of the link table
+}
+
+// Initialize Huffman decoding tables from array of code lengths.
+// Following this function, h is guaranteed to be initialized into a complete
+// tree (i.e., neither over-subscribed nor under-subscribed). The exception is a
+// degenerate case where the tree has only a single symbol with length 1. Empty
+// trees are permitted.
+func (h *huffmanDecoder) init(bits []int) bool {
+	// Sanity enables additional runtime tests during Huffman
+	// table construction. It's intended to be used during
+	// development to supplement the currently ad-hoc unit tests.
+	const sanity = false
+
+	if h.min != 0 {
+		*h = huffmanDecoder{}
+	}
+
+	// Count number of codes of each length,
+	// compute min and max length.
+	var count [maxCodeLen]int
+	var min, max int
+	for _, n := range bits {
+		if n == 0 {
+			continue
+		}
+		if min == 0 || n < min {
+			min = n
+		}
+		if n > max {
+			max = n
+		}
+		count[n]++
+	}
+
+	// Empty tree. The decompressor.huffSym function will fail later if the tree
+	// is used. Technically, an empty tree is only valid for the HDIST tree and
+	// not the HCLEN and HLIT tree. However, a stream with an empty HCLEN tree
+	// is guaranteed to fail since it will attempt to use the tree to decode the
+	// codes for the HLIT and HDIST trees. Similarly, an empty HLIT tree is
+	// guaranteed to fail later since the compressed data section must be
+	// composed of at least one symbol (the end-of-block marker).
+	if max == 0 {
+		return true
+	}
+
+	code := 0
+	var nextcode [maxCodeLen]int
+	for i := min; i <= max; i++ {
+		code <<= 1
+		nextcode[i] = code
+		code += count[i]
+	}
+
+	// Check that the coding is complete (i.e., that we've
+	// assigned all 2-to-the-max possible bit sequences).
+	// Exception: To be compatible with zlib, we also need to
+	// accept degenerate single-code codings. See also
+	// TestDegenerateHuffmanCoding.
+	if code != 1<<uint(max) && !(code == 1 && max == 1) {
+		return false
+	}
+
+	h.min = min
+	if max > huffmanChunkBits {
+		numLinks := 1 << (uint(max) - huffmanChunkBits)
+		h.linkMask = uint32(numLinks - 1)
+
+		// create link tables
+		link := nextcode[huffmanChunkBits+1] >> 1
+		h.links = make([][]uint32, huffmanNumChunks-link)
+		for j := uint(link); j < huffmanNumChunks; j++ {
+			reverse := int(reverseByte[j>>8]) | int(reverseByte[j&0xff])<<8
+			reverse >>= uint(16 - huffmanChunkBits)
+			off := j - uint(link)
+			if sanity && h.chunks[reverse] != 0 {
+				panic("impossible: overwriting existing chunk")
+			}
+			h.chunks[reverse] = uint32(off<<huffmanValueShift | (huffmanChunkBits + 1))
+			h.links[off] = make([]uint32, numLinks)
+		}
+	}
+
+	for i, n := range bits {
+		if n == 0 {
+			continue
+		}
+		code := nextcode[n]
+		nextcode[n]++
+		chunk := uint32(i<<huffmanValueShift | n)
+		reverse := int(reverseByte[code>>8]) | int(reverseByte[code&0xff])<<8
+		reverse >>= uint(16 - n)
+		if n <= huffmanChunkBits {
+			for off := reverse; off < len(h.chunks); off += 1 << uint(n) {
+				// We should never need to overwrite
+				// an existing chunk. Also, 0 is
+				// never a valid chunk, because the
+				// lower 4 "count" bits should be
+				// between 1 and 15.
+				if sanity && h.chunks[off] != 0 {
+					panic("impossible: overwriting existing chunk")
+				}
+				h.chunks[off] = chunk
+			}
+		} else {
+			j := reverse & (huffmanNumChunks - 1)
+			if sanity && h.chunks[j]&huffmanCountMask != huffmanChunkBits+1 {
+				// Longer codes should have been
+				// associated with a link table above.
+				panic("impossible: not an indirect chunk")
+			}
+			value := h.chunks[j] >> huffmanValueShift
+			linktab := h.links[value]
+			reverse >>= huffmanChunkBits
+			for off := reverse; off < len(linktab); off += 1 << uint(n-huffmanChunkBits) {
+				if sanity && linktab[off] != 0 {
+					panic("impossible: overwriting existing chunk")
+				}
+				linktab[off] = chunk
+			}
+		}
+	}
+
+	if sanity {
+		// Above we've sanity checked that we never overwrote
+		// an existing entry. Here we additionally check that
+		// we filled the tables completely.
+		for i, chunk := range h.chunks {
+			if chunk == 0 {
+				// As an exception, in the degenerate
+				// single-code case, we allow odd
+				// chunks to be missing.
+				if code == 1 && i%2 == 1 {
+					continue
+				}
+				panic("impossible: missing chunk")
+			}
+		}
+		for _, linktab := range h.links {
+			for _, chunk := range linktab {
+				if chunk == 0 {
+					panic("impossible: missing chunk")
+				}
+			}
+		}
+	}
+
+	return true
+}
+
+// The actual read interface needed by NewReader.
+// If the passed in io.Reader does not also have ReadByte,
+// the NewReader will introduce its own buffering.
+type Reader interface {
+	io.Reader
+	io.ByteReader
+}
+
+// Decompress state.
+type decompressor struct {
+	// Input source.
+	r       Reader
+	roffset int64
+
+	// Input bits, in top of b.
+	b  uint32
+	nb uint
+
+	// Huffman decoders for literal/length, distance.
+	h1, h2 huffmanDecoder
+
+	// Length arrays used to define Huffman codes.
+	bits     *[maxNumLit + maxNumDist]int
+	codebits *[numCodes]int
+
+	// Output history, buffer.
+	dict dictDecoder
+
+	// Temporary buffer (avoids repeated allocation).
+	buf [4]byte
+
+	// Next step in the decompression,
+	// and decompression state.
+	step      func(*decompressor)
+	stepState int
+	final     bool
+	err       error
+	toRead    []byte
+	hl, hd    *huffmanDecoder
+	copyLen   int
+	copyDist  int
+}
+
+func (f *decompressor) nextBlock() {
+	for f.nb < 1+2 {
+		if f.err = f.moreBits(); f.err != nil {
+			return
+		}
+	}
+	f.final = f.b&1 == 1
+	f.b >>= 1
+	typ := f.b & 3
+	f.b >>= 2
+	f.nb -= 1 + 2
+	switch typ {
+	case 0:
+		f.dataBlock()
+	case 1:
+		// compressed, fixed Huffman tables
+		f.hl = &fixedHuffmanDecoder
+		f.hd = nil
+		f.huffmanBlock()
+	case 2:
+		// compressed, dynamic Huffman tables
+		if f.err = f.readHuffman(); f.err != nil {
+			break
+		}
+		f.hl = &f.h1
+		f.hd = &f.h2
+		f.huffmanBlock()
+	default:
+		// 3 is reserved.
+		f.err = CorruptInputError(f.roffset)
+	}
+}
+
+func (f *decompressor) Read(b []byte) (int, error) {
+	for {
+		if len(f.toRead) > 0 {
+			n := copy(b, f.toRead)
+			f.toRead = f.toRead[n:]
+			if len(f.toRead) == 0 {
+				return n, f.err
+			}
+			return n, nil
+		}
+		if f.err != nil {
+			return 0, f.err
+		}
+		f.step(f)
+		if f.err != nil && len(f.toRead) == 0 {
+			f.toRead = f.dict.readFlush() // Flush what's left in case of error
+		}
+	}
+}
+
+// Support the io.WriteTo interface for io.Copy and friends.
+func (f *decompressor) WriteTo(w io.Writer) (int64, error) {
+	total := int64(0)
+	flushed := false
+	for {
+		if len(f.toRead) > 0 {
+			n, err := w.Write(f.toRead)
+			total += int64(n)
+			if err != nil {
+				f.err = err
+				return total, err
+			}
+			if n != len(f.toRead) {
+				return total, io.ErrShortWrite
+			}
+			f.toRead = f.toRead[:0]
+		}
+		if f.err != nil && flushed {
+			if f.err == io.EOF {
+				return total, nil
+			}
+			return total, f.err
+		}
+		if f.err == nil {
+			f.step(f)
+		}
+		if len(f.toRead) == 0 && f.err != nil && !flushed {
+			f.toRead = f.dict.readFlush() // Flush what's left in case of error
+			flushed = true
+		}
+	}
+}
+
+func (f *decompressor) Close() error {
+	if f.err == io.EOF {
+		return nil
+	}
+	return f.err
+}
+
+// RFC 1951 section 3.2.7.
+// Compression with dynamic Huffman codes
+
+var codeOrder = [...]int{16, 17, 18, 0, 8, 7, 9, 6, 10, 5, 11, 4, 12, 3, 13, 2, 14, 1, 15}
+
+func (f *decompressor) readHuffman() error {
+	// HLIT[5], HDIST[5], HCLEN[4].
+	for f.nb < 5+5+4 {
+		if err := f.moreBits(); err != nil {
+			return err
+		}
+	}
+	nlit := int(f.b&0x1F) + 257
+	if nlit > maxNumLit {
+		return CorruptInputError(f.roffset)
+	}
+	f.b >>= 5
+	ndist := int(f.b&0x1F) + 1
+	if ndist > maxNumDist {
+		return CorruptInputError(f.roffset)
+	}
+	f.b >>= 5
+	nclen := int(f.b&0xF) + 4
+	// numCodes is 19, so nclen is always valid.
+	f.b >>= 4
+	f.nb -= 5 + 5 + 4
+
+	// (HCLEN+4)*3 bits: code lengths in the magic codeOrder order.
+	for i := 0; i < nclen; i++ {
+		for f.nb < 3 {
+			if err := f.moreBits(); err != nil {
+				return err
+			}
+		}
+		f.codebits[codeOrder[i]] = int(f.b & 0x7)
+		f.b >>= 3
+		f.nb -= 3
+	}
+	for i := nclen; i < len(codeOrder); i++ {
+		f.codebits[codeOrder[i]] = 0
+	}
+	if !f.h1.init(f.codebits[0:]) {
+		return CorruptInputError(f.roffset)
+	}
+
+	// HLIT + 257 code lengths, HDIST + 1 code lengths,
+	// using the code length Huffman code.
+	for i, n := 0, nlit+ndist; i < n; {
+		x, err := f.huffSym(&f.h1)
+		if err != nil {
+			return err
+		}
+		if x < 16 {
+			// Actual length.
+			f.bits[i] = x
+			i++
+			continue
+		}
+		// Repeat previous length or zero.
+		var rep int
+		var nb uint
+		var b int
+		switch x {
+		default:
+			return InternalError("unexpected length code")
+		case 16:
+			rep = 3
+			nb = 2
+			if i == 0 {
+				return CorruptInputError(f.roffset)
+			}
+			b = f.bits[i-1]
+		case 17:
+			rep = 3
+			nb = 3
+			b = 0
+		case 18:
+			rep = 11
+			nb = 7
+			b = 0
+		}
+		for f.nb < nb {
+			if err := f.moreBits(); err != nil {
+				return err
+			}
+		}
+		rep += int(f.b & uint32(1<<nb-1))
+		f.b >>= nb
+		f.nb -= nb
+		if i+rep > n {
+			return CorruptInputError(f.roffset)
+		}
+		for j := 0; j < rep; j++ {
+			f.bits[i] = b
+			i++
+		}
+	}
+
+	if !f.h1.init(f.bits[0:nlit]) || !f.h2.init(f.bits[nlit:nlit+ndist]) {
+		return CorruptInputError(f.roffset)
+	}
+
+	// As an optimization, we can initialize the min bits to read at a time
+	// for the HLIT tree to the length of the EOB marker since we know that
+	// every block must terminate with one. This preserves the property that
+	// we never read any extra bytes after the end of the DEFLATE stream.
+	if f.h1.min < f.bits[endBlockMarker] {
+		f.h1.min = f.bits[endBlockMarker]
+	}
+
+	return nil
+}
+
+// Decode a single Huffman block from f.
+// hl and hd are the Huffman states for the lit/length values
+// and the distance values, respectively. If hd == nil, using the
+// fixed distance encoding associated with fixed Huffman blocks.
+func (f *decompressor) huffmanBlock() {
+	const (
+		stateInit = iota // Zero value must be stateInit
+		stateDict
+	)
+
+	switch f.stepState {
+	case stateInit:
+		goto readLiteral
+	case stateDict:
+		goto copyHistory
+	}
+
+readLiteral:
+	// Read literal and/or (length, distance) according to RFC section 3.2.3.
+	{
+		v, err := f.huffSym(f.hl)
+		if err != nil {
+			f.err = err
+			return
+		}
+		var n uint // number of bits extra
+		var length int
+		switch {
+		case v < 256:
+			f.dict.writeByte(byte(v))
+			if f.dict.availWrite() == 0 {
+				f.toRead = f.dict.readFlush()
+				f.step = (*decompressor).huffmanBlock
+				f.stepState = stateInit
+				return
+			}
+			goto readLiteral
+		case v == 256:
+			f.finishBlock()
+			return
+		// otherwise, reference to older data
+		case v < 265:
+			length = v - (257 - 3)
+			n = 0
+		case v < 269:
+			length = v*2 - (265*2 - 11)
+			n = 1
+		case v < 273:
+			length = v*4 - (269*4 - 19)
+			n = 2
+		case v < 277:
+			length = v*8 - (273*8 - 35)
+			n = 3
+		case v < 281:
+			length = v*16 - (277*16 - 67)
+			n = 4
+		case v < 285:
+			length = v*32 - (281*32 - 131)
+			n = 5
+		case v < maxNumLit:
+			length = 258
+			n = 0
+		default:
+			f.err = CorruptInputError(f.roffset)
+			return
+		}
+		if n > 0 {
+			for f.nb < n {
+				if err = f.moreBits(); err != nil {
+					f.err = err
+					return
+				}
+			}
+			length += int(f.b & uint32(1<<n-1))
+			f.b >>= n
+			f.nb -= n
+		}
+
+		var dist int
+		if f.hd == nil {
+			for f.nb < 5 {
+				if err = f.moreBits(); err != nil {
+					f.err = err
+					return
+				}
+			}
+			dist = int(reverseByte[(f.b&0x1F)<<3])
+			f.b >>= 5
+			f.nb -= 5
+		} else {
+			if dist, err = f.huffSym(f.hd); err != nil {
+				f.err = err
+				return
+			}
+		}
+
+		switch {
+		case dist < 4:
+			dist++
+		case dist < maxNumDist:
+			nb := uint(dist-2) >> 1
+			// have 1 bit in bottom of dist, need nb more.
+			extra := (dist & 1) << nb
+			for f.nb < nb {
+				if err = f.moreBits(); err != nil {
+					f.err = err
+					return
+				}
+			}
+			extra |= int(f.b & uint32(1<<nb-1))
+			f.b >>= nb
+			f.nb -= nb
+			dist = 1<<(nb+1) + 1 + extra
+		default:
+			f.err = CorruptInputError(f.roffset)
+			return
+		}
+
+		// No check on length; encoding can be prescient.
+		if dist > f.dict.histSize() {
+			f.err = CorruptInputError(f.roffset)
+			return
+		}
+
+		f.copyLen, f.copyDist = length, dist
+		goto copyHistory
+	}
+
+copyHistory:
+	// Perform a backwards copy according to RFC section 3.2.3.
+	{
+		cnt := f.dict.tryWriteCopy(f.copyDist, f.copyLen)
+		if cnt == 0 {
+			cnt = f.dict.writeCopy(f.copyDist, f.copyLen)
+		}
+		f.copyLen -= cnt
+
+		if f.dict.availWrite() == 0 || f.copyLen > 0 {
+			f.toRead = f.dict.readFlush()
+			f.step = (*decompressor).huffmanBlock // We need to continue this work
+			f.stepState = stateDict
+			return
+		}
+		goto readLiteral
+	}
+}
+
+// Copy a single uncompressed data block from input to output.
+func (f *decompressor) dataBlock() {
+	// Uncompressed.
+	// Discard current half-byte.
+	f.nb = 0
+	f.b = 0
+
+	// Length then ones-complement of length.
+	nr, err := io.ReadFull(f.r, f.buf[0:4])
+	f.roffset += int64(nr)
+	if err != nil {
+		if err == io.EOF {
+			err = io.ErrUnexpectedEOF
+		}
+		f.err = err
+		return
+	}
+	n := int(f.buf[0]) | int(f.buf[1])<<8
+	nn := int(f.buf[2]) | int(f.buf[3])<<8
+	if uint16(nn) != uint16(^n) {
+		f.err = CorruptInputError(f.roffset)
+		return
+	}
+
+	if n == 0 {
+		f.toRead = f.dict.readFlush()
+		f.finishBlock()
+		return
+	}
+
+	f.copyLen = n
+	f.copyData()
+}
+
+// copyData copies f.copyLen bytes from the underlying reader into f.hist.
+// It pauses for reads when f.hist is full.
+func (f *decompressor) copyData() {
+	buf := f.dict.writeSlice()
+	if len(buf) > f.copyLen {
+		buf = buf[:f.copyLen]
+	}
+
+	cnt, err := io.ReadFull(f.r, buf)
+	f.roffset += int64(cnt)
+	f.copyLen -= cnt
+	f.dict.writeMark(cnt)
+	if err != nil {
+		if err == io.EOF {
+			err = io.ErrUnexpectedEOF
+		}
+		f.err = err
+		return
+	}
+
+	if f.dict.availWrite() == 0 || f.copyLen > 0 {
+		f.toRead = f.dict.readFlush()
+		f.step = (*decompressor).copyData
+		return
+	}
+	f.finishBlock()
+}
+
+func (f *decompressor) finishBlock() {
+	if f.final {
+		if f.dict.availRead() > 0 {
+			f.toRead = f.dict.readFlush()
+		}
+		f.err = io.EOF
+	}
+	f.step = (*decompressor).nextBlock
+}
+
+func (f *decompressor) moreBits() error {
+	c, err := f.r.ReadByte()
+	if err != nil {
+		if err == io.EOF {
+			err = io.ErrUnexpectedEOF
+		}
+		return err
+	}
+	f.roffset++
+	f.b |= uint32(c) << f.nb
+	f.nb += 8
+	return nil
+}
+
+// Read the next Huffman-encoded symbol from f according to h.
+func (f *decompressor) huffSym(h *huffmanDecoder) (int, error) {
+	// Since a huffmanDecoder can be empty or be composed of a degenerate tree
+	// with single element, huffSym must error on these two edge cases. In both
+	// cases, the chunks slice will be 0 for the invalid sequence, leading it
+	// satisfy the n == 0 check below.
+	n := uint(h.min)
+	for {
+		for f.nb < n {
+			if err := f.moreBits(); err != nil {
+				return 0, err
+			}
+		}
+		chunk := h.chunks[f.b&(huffmanNumChunks-1)]
+		n = uint(chunk & huffmanCountMask)
+		if n > huffmanChunkBits {
+			chunk = h.links[chunk>>huffmanValueShift][(f.b>>huffmanChunkBits)&h.linkMask]
+			n = uint(chunk & huffmanCountMask)
+		}
+		if n <= f.nb {
+			if n == 0 {
+				f.err = CorruptInputError(f.roffset)
+				return 0, f.err
+			}
+			f.b >>= n
+			f.nb -= n
+			return int(chunk >> huffmanValueShift), nil
+		}
+	}
+}
+
+func makeReader(r io.Reader) Reader {
+	if rr, ok := r.(Reader); ok {
+		return rr
+	}
+	return bufio.NewReader(r)
+}
+
+func fixedHuffmanDecoderInit() {
+	fixedOnce.Do(func() {
+		// These come from the RFC section 3.2.6.
+		var bits [288]int
+		for i := 0; i < 144; i++ {
+			bits[i] = 8
+		}
+		for i := 144; i < 256; i++ {
+			bits[i] = 9
+		}
+		for i := 256; i < 280; i++ {
+			bits[i] = 7
+		}
+		for i := 280; i < 288; i++ {
+			bits[i] = 8
+		}
+		fixedHuffmanDecoder.init(bits[:])
+	})
+}
+
+func (f *decompressor) Reset(r io.Reader, dict []byte) error {
+	*f = decompressor{
+		r:        makeReader(r),
+		bits:     f.bits,
+		codebits: f.codebits,
+		dict:     f.dict,
+		step:     (*decompressor).nextBlock,
+	}
+	f.dict.init(maxMatchOffset, dict)
+	return nil
+}
+
+// NewReader returns a new ReadCloser that can be used
+// to read the uncompressed version of r.
+// If r does not also implement io.ByteReader,
+// the decompressor may read more data than necessary from r.
+// It is the caller's responsibility to call Close on the ReadCloser
+// when finished reading.
+//
+// The ReadCloser returned by NewReader also implements Resetter.
+func NewReader(r io.Reader) io.ReadCloser {
+	fixedHuffmanDecoderInit()
+
+	var f decompressor
+	f.r = makeReader(r)
+	f.bits = new([maxNumLit + maxNumDist]int)
+	f.codebits = new([numCodes]int)
+	f.step = (*decompressor).nextBlock
+	f.dict.init(maxMatchOffset, nil)
+	return &f
+}
+
+// NewReaderDict is like NewReader but initializes the reader
+// with a preset dictionary. The returned Reader behaves as if
+// the uncompressed data stream started with the given dictionary,
+// which has already been read. NewReaderDict is typically used
+// to read data compressed by NewWriterDict.
+//
+// The ReadCloser returned by NewReader also implements Resetter.
+func NewReaderDict(r io.Reader, dict []byte) io.ReadCloser {
+	fixedHuffmanDecoderInit()
+
+	var f decompressor
+	f.r = makeReader(r)
+	f.bits = new([maxNumLit + maxNumDist]int)
+	f.codebits = new([numCodes]int)
+	f.step = (*decompressor).nextBlock
+	f.dict.init(maxMatchOffset, dict)
+	return &f
+}
--- a/vendor/github.com/klauspost/compress/flate/inflate_test.go
+++ b/vendor/github.com/klauspost/compress/flate/inflate_test.go
@@ -0,0 +1,282 @@
+// Copyright 2014 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+package flate
+
+import (
+	"bytes"
+	"crypto/rand"
+	"io"
+	"io/ioutil"
+	"strconv"
+	"strings"
+	"testing"
+)
+
+func TestReset(t *testing.T) {
+	ss := []string{
+		"lorem ipsum izzle fo rizzle",
+		"the quick brown fox jumped over",
+	}
+
+	deflated := make([]bytes.Buffer, 2)
+	for i, s := range ss {
+		w, _ := NewWriter(&deflated[i], 1)
+		w.Write([]byte(s))
+		w.Close()
+	}
+
+	inflated := make([]bytes.Buffer, 2)
+
+	f := NewReader(&deflated[0])
+	io.Copy(&inflated[0], f)
+	f.(Resetter).Reset(&deflated[1], nil)
+	io.Copy(&inflated[1], f)
+	f.Close()
+
+	for i, s := range ss {
+		if s != inflated[i].String() {
+			t.Errorf("inflated[%d]:\ngot  %q\nwant %q", i, inflated[i], s)
+		}
+	}
+}
+
+func TestReaderTruncated(t *testing.T) {
+	vectors := []struct{ input, output string }{
+		{"\x00", ""},
+		{"\x00\f", ""},
+		{"\x00\f\x00", ""},
+		{"\x00\f\x00\xf3\xff", ""},
+		{"\x00\f\x00\xf3\xffhello", "hello"},
+		{"\x00\f\x00\xf3\xffhello, world", "hello, world"},
+		{"\x02", ""},
+		{"\xf2H\xcd", "He"},
+		{"\xf2H͙0a\u0084\t", "Hel\x90\x90\x90\x90\x90"},
+		{"\xf2H͙0a\u0084\t\x00", "Hel\x90\x90\x90\x90\x90"},
+	}
+
+	for i, v := range vectors {
+		r := strings.NewReader(v.input)
+		zr := NewReader(r)
+		b, err := ioutil.ReadAll(zr)
+		if err != io.ErrUnexpectedEOF {
+			t.Errorf("test %d, error mismatch: got %v, want io.ErrUnexpectedEOF", i, err)
+		}
+		if string(b) != v.output {
+			t.Errorf("test %d, output mismatch: got %q, want %q", i, b, v.output)
+		}
+	}
+}
+
+func TestResetDict(t *testing.T) {
+	dict := []byte("the lorem fox")
+	ss := []string{
+		"lorem ipsum izzle fo rizzle",
+		"the quick brown fox jumped over",
+	}
+
+	deflated := make([]bytes.Buffer, len(ss))
+	for i, s := range ss {
+		w, _ := NewWriterDict(&deflated[i], DefaultCompression, dict)
+		w.Write([]byte(s))
+		w.Close()
+	}
+
+	inflated := make([]bytes.Buffer, len(ss))
+
+	f := NewReader(nil)
+	for i := range inflated {
+		f.(Resetter).Reset(&deflated[i], dict)
+		io.Copy(&inflated[i], f)
+	}
+	f.Close()
+
+	for i, s := range ss {
+		if s != inflated[i].String() {
+			t.Errorf("inflated[%d]:\ngot  %q\nwant %q", i, inflated[i], s)
+		}
+	}
+}
+
+// Tests ported from zlib/test/infcover.c
+type infTest struct {
+	hex string
+	id  string
+	n   int
+}
+
+var infTests = []infTest{
+	{"0 0 0 0 0", "invalid stored block lengths", 1},
+	{"3 0", "fixed", 0},
+	{"6", "invalid block type", 1},
+	{"1 1 0 fe ff 0", "stored", 0},
+	{"fc 0 0", "too many length or distance symbols", 1},
+	{"4 0 fe ff", "invalid code lengths set", 1},
+	{"4 0 24 49 0", "invalid bit length repeat", 1},
+	{"4 0 24 e9 ff ff", "invalid bit length repeat", 1},
+	{"4 0 24 e9 ff 6d", "invalid code -- missing end-of-block", 1},
+	{"4 80 49 92 24 49 92 24 71 ff ff 93 11 0", "invalid literal/lengths set", 1},
+	{"4 80 49 92 24 49 92 24 f b4 ff ff c3 84", "invalid distances set", 1},
+	{"4 c0 81 8 0 0 0 0 20 7f eb b 0 0", "invalid literal/length code", 1},
+	{"2 7e ff ff", "invalid distance code", 1},
+	{"c c0 81 0 0 0 0 0 90 ff 6b 4 0", "invalid distance too far back", 1},
+
+	// also trailer mismatch just in inflate()
+	{"1f 8b 8 0 0 0 0 0 0 0 3 0 0 0 0 1", "incorrect data check", -1},
+	{"1f 8b 8 0 0 0 0 0 0 0 3 0 0 0 0 0 0 0 0 1", "incorrect length check", -1},
+	{"5 c0 21 d 0 0 0 80 b0 fe 6d 2f 91 6c", "pull 17", 0},
+	{"5 e0 81 91 24 cb b2 2c 49 e2 f 2e 8b 9a 47 56 9f fb fe ec d2 ff 1f", "long code", 0},
+	{"ed c0 1 1 0 0 0 40 20 ff 57 1b 42 2c 4f", "length extra", 0},
+	{"ed cf c1 b1 2c 47 10 c4 30 fa 6f 35 1d 1 82 59 3d fb be 2e 2a fc f c", "long distance and extra", 0},
+	{"ed c0 81 0 0 0 0 80 a0 fd a9 17 a9 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 6", "window end", 0},
+}
+
+func TestInflate(t *testing.T) {
+	for _, test := range infTests {
+		hex := strings.Split(test.hex, " ")
+		data := make([]byte, len(hex))
+		for i, h := range hex {
+			b, _ := strconv.ParseInt(h, 16, 32)
+			data[i] = byte(b)
+		}
+		buf := bytes.NewReader(data)
+		r := NewReader(buf)
+
+		_, err := io.Copy(ioutil.Discard, r)
+		if (test.n == 0 && err == nil) || (test.n != 0 && err != nil) {
+			t.Logf("%q: OK:", test.id)
+			t.Logf(" - got %v", err)
+			continue
+		}
+
+		if test.n == 0 && err != nil {
+			t.Errorf("%q: Expected no error, but got %v", test.id, err)
+			continue
+		}
+
+		if test.n != 0 && err == nil {
+			t.Errorf("%q:Expected an error, but got none", test.id)
+			continue
+		}
+		t.Fatal(test.n, err)
+	}
+
+	for _, test := range infOutTests {
+		hex := strings.Split(test.hex, " ")
+		data := make([]byte, len(hex))
+		for i, h := range hex {
+			b, _ := strconv.ParseInt(h, 16, 32)
+			data[i] = byte(b)
+		}
+		buf := bytes.NewReader(data)
+		r := NewReader(buf)
+
+		_, err := io.Copy(ioutil.Discard, r)
+		if test.err == (err != nil) {
+			t.Logf("%q: OK:", test.id)
+			t.Logf(" - got %v", err)
+			continue
+		}
+
+		if test.err == false && err != nil {
+			t.Errorf("%q: Expected no error, but got %v", test.id, err)
+			continue
+		}
+
+		if test.err && err == nil {
+			t.Errorf("%q: Expected an error, but got none", test.id)
+			continue
+		}
+		t.Fatal(test.err, err)
+	}
+
+}
+
+// Tests ported from zlib/test/infcover.c
+// Since zlib inflate is push (writer) instead of pull (reader)
+// some of the window size tests have been removed, since they
+// are irrelevant.
+type infOutTest struct {
+	hex    string
+	id     string
+	step   int
+	win    int
+	length int
+	err    bool
+}
+
+var infOutTests = []infOutTest{
+	{"2 8 20 80 0 3 0", "inflate_fast TYPE return", 0, -15, 258, false},
+	{"63 18 5 40 c 0", "window wrap", 3, -8, 300, false},
+	{"e5 e0 81 ad 6d cb b2 2c c9 01 1e 59 63 ae 7d ee fb 4d fd b5 35 41 68 ff 7f 0f 0 0 0", "fast length extra bits", 0, -8, 258, true},
+	{"25 fd 81 b5 6d 59 b6 6a 49 ea af 35 6 34 eb 8c b9 f6 b9 1e ef 67 49 50 fe ff ff 3f 0 0", "fast distance extra bits", 0, -8, 258, true},
+	{"3 7e 0 0 0 0 0", "fast invalid distance code", 0, -8, 258, true},
+	{"1b 7 0 0 0 0 0", "fast invalid literal/length code", 0, -8, 258, true},
+	{"d c7 1 ae eb 38 c 4 41 a0 87 72 de df fb 1f b8 36 b1 38 5d ff ff 0", "fast 2nd level codes and too far back", 0, -8, 258, true},
+	{"63 18 5 8c 10 8 0 0 0 0", "very common case", 0, -8, 259, false},
+	{"63 60 60 18 c9 0 8 18 18 18 26 c0 28 0 29 0 0 0", "contiguous and wrap around window", 6, -8, 259, false},
+	{"63 0 3 0 0 0 0 0", "copy direct from output", 0, -8, 259, false},
+	{"1f 8b 0 0", "bad gzip method", 0, 31, 0, true},
+	{"1f 8b 8 80", "bad gzip flags", 0, 31, 0, true},
+	{"77 85", "bad zlib method", 0, 15, 0, true},
+	{"78 9c", "bad zlib window size", 0, 8, 0, true},
+	{"1f 8b 8 1e 0 0 0 0 0 0 1 0 0 0 0 0 0", "bad header crc", 0, 47, 1, true},
+	{"1f 8b 8 2 0 0 0 0 0 0 1d 26 3 0 0 0 0 0 0 0 0 0", "check gzip length", 0, 47, 0, true},
+	{"78 90", "bad zlib header check", 0, 47, 0, true},
+	{"8 b8 0 0 0 1", "need dictionary", 0, 8, 0, true},
+	{"63 18 68 30 d0 0 0", "force split window update", 4, -8, 259, false},
+	{"3 0", "use fixed blocks", 0, -15, 1, false},
+	{"", "bad window size", 0, 1, 0, true},
+}
+
+func TestWriteTo(t *testing.T) {
+	input := make([]byte, 100000)
+	n, err := rand.Read(input)
+	if err != nil {
+		t.Fatal(err)
+	}
+	if n != len(input) {
+		t.Fatal("did not fill buffer")
+	}
+	compressed := &bytes.Buffer{}
+	w, err := NewWriter(compressed, -2)
+	if err != nil {
+		t.Fatal(err)
+	}
+	n, err = w.Write(input)
+	if err != nil {
+		t.Fatal(err)
+	}
+	if n != len(input) {
+		t.Fatal("did not fill buffer")
+	}
+	w.Close()
+	buf := compressed.Bytes()
+
+	dec := NewReader(bytes.NewBuffer(buf))
+	// ReadAll does not use WriteTo, but we wrap it in a NopCloser to be sure.
+	readall, err := ioutil.ReadAll(ioutil.NopCloser(dec))
+	if err != nil {
+		t.Fatal(err)
+	}
+	if len(readall) != len(input) {
+		t.Fatal("did not decompress everything")
+	}
+
+	dec = NewReader(bytes.NewBuffer(buf))
+	wtbuf := &bytes.Buffer{}
+	written, err := dec.(io.WriterTo).WriteTo(wtbuf)
+	if err != nil {
+		t.Fatal(err)
+	}
+	if written != int64(len(input)) {
+		t.Error("Returned length did not match, expected", len(input), "got", written)
+	}
+	if wtbuf.Len() != len(input) {
+		t.Error("Actual Length did not match, expected", len(input), "got", wtbuf.Len())
+	}
+	if bytes.Compare(wtbuf.Bytes(), input) != 0 {
+		t.Fatal("output did not match input")
+	}
+}
--- a/vendor/github.com/klauspost/compress/flate/reader_test.go
+++ b/vendor/github.com/klauspost/compress/flate/reader_test.go
@@ -0,0 +1,97 @@
+// Copyright 2012 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+package flate
+
+import (
+	"bytes"
+	"io"
+	"io/ioutil"
+	"runtime"
+	"strings"
+	"testing"
+)
+
+func TestNlitOutOfRange(t *testing.T) {
+	// Trying to decode this bogus flate data, which has a Huffman table
+	// with nlit=288, should not panic.
+	io.Copy(ioutil.Discard, NewReader(strings.NewReader(
+		"\xfc\xfe\x36\xe7\x5e\x1c\xef\xb3\x55\x58\x77\xb6\x56\xb5\x43\xf4"+
+			"\x6f\xf2\xd2\xe6\x3d\x99\xa0\x85\x8c\x48\xeb\xf8\xda\x83\x04\x2a"+
+			"\x75\xc4\xf8\x0f\x12\x11\xb9\xb4\x4b\x09\xa0\xbe\x8b\x91\x4c")))
+}
+
+const (
+	digits = iota
+	twain
+)
+
+var testfiles = []string{
+	// Digits is the digits of the irrational number e. Its decimal representation
+	// does not repeat, but there are only 10 possible digits, so it should be
+	// reasonably compressible.
+	digits: "../testdata/e.txt",
+	// Twain is Project Gutenberg's edition of Mark Twain's classic English novel.
+	twain: "../testdata/Mark.Twain-Tom.Sawyer.txt",
+}
+
+func benchmarkDecode(b *testing.B, testfile, level, n int) {
+	b.ReportAllocs()
+	b.StopTimer()
+	b.SetBytes(int64(n))
+	buf0, err := ioutil.ReadFile(testfiles[testfile])
+	if err != nil {
+		b.Fatal(err)
+	}
+	if len(buf0) == 0 {
+		b.Fatalf("test file %q has no data", testfiles[testfile])
+	}
+	compressed := new(bytes.Buffer)
+	w, err := NewWriter(compressed, level)
+	if err != nil {
+		b.Fatal(err)
+	}
+	for i := 0; i < n; i += len(buf0) {
+		if len(buf0) > n-i {
+			buf0 = buf0[:n-i]
+		}
+		io.Copy(w, bytes.NewReader(buf0))
+	}
+	w.Close()
+	buf1 := compressed.Bytes()
+	buf0, compressed, w = nil, nil, nil
+	runtime.GC()
+	b.StartTimer()
+	for i := 0; i < b.N; i++ {
+		io.Copy(ioutil.Discard, NewReader(bytes.NewReader(buf1)))
+	}
+}
+
+// These short names are so that gofmt doesn't break the BenchmarkXxx function
+// bodies below over multiple lines.
+const (
+	constant = ConstantCompression
+	speed    = BestSpeed
+	default_ = DefaultCompression
+	compress = BestCompression
+)
+
+func BenchmarkDecodeDigitsSpeed1e4(b *testing.B)    { benchmarkDecode(b, digits, speed, 1e4) }
+func BenchmarkDecodeDigitsSpeed1e5(b *testing.B)    { benchmarkDecode(b, digits, speed, 1e5) }
+func BenchmarkDecodeDigitsSpeed1e6(b *testing.B)    { benchmarkDecode(b, digits, speed, 1e6) }
+func BenchmarkDecodeDigitsDefault1e4(b *testing.B)  { benchmarkDecode(b, digits, default_, 1e4) }
+func BenchmarkDecodeDigitsDefault1e5(b *testing.B)  { benchmarkDecode(b, digits, default_, 1e5) }
+func BenchmarkDecodeDigitsDefault1e6(b *testing.B)  { benchmarkDecode(b, digits, default_, 1e6) }
+func BenchmarkDecodeDigitsCompress1e4(b *testing.B) { benchmarkDecode(b, digits, compress, 1e4) }
+func BenchmarkDecodeDigitsCompress1e5(b *testing.B) { benchmarkDecode(b, digits, compress, 1e5) }
+func BenchmarkDecodeDigitsCompress1e6(b *testing.B) { benchmarkDecode(b, digits, compress, 1e6) }
+func BenchmarkDecodeTwainSpeed1e4(b *testing.B)     { benchmarkDecode(b, twain, speed, 1e4) }
+func BenchmarkDecodeTwainSpeed1e5(b *testing.B)     { benchmarkDecode(b, twain, speed, 1e5) }
+func BenchmarkDecodeTwainSpeed1e6(b *testing.B)     { benchmarkDecode(b, twain, speed, 1e6) }
+func BenchmarkDecodeTwainDefault1e4(b *testing.B)   { benchmarkDecode(b, twain, default_, 1e4) }
+func BenchmarkDecodeTwainDefault1e5(b *testing.B)   { benchmarkDecode(b, twain, default_, 1e5) }
+func BenchmarkDecodeTwainDefault1e6(b *testing.B)   { benchmarkDecode(b, twain, default_, 1e6) }
+func BenchmarkDecodeTwainCompress1e4(b *testing.B)  { benchmarkDecode(b, twain, compress, 1e4) }
+func BenchmarkDecodeTwainCompress1e5(b *testing.B)  { benchmarkDecode(b, twain, compress, 1e5) }
+func BenchmarkDecodeTwainCompress1e6(b *testing.B)  { benchmarkDecode(b, twain, compress, 1e6) }
--- a/vendor/github.com/klauspost/compress/flate/reverse_bits.go
+++ b/vendor/github.com/klauspost/compress/flate/reverse_bits.go
@@ -0,0 +1,48 @@
+// Copyright 2009 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+package flate
+
+var reverseByte = [256]byte{
+	0x00, 0x80, 0x40, 0xc0, 0x20, 0xa0, 0x60, 0xe0,
+	0x10, 0x90, 0x50, 0xd0, 0x30, 0xb0, 0x70, 0xf0,
+	0x08, 0x88, 0x48, 0xc8, 0x28, 0xa8, 0x68, 0xe8,
+	0x18, 0x98, 0x58, 0xd8, 0x38, 0xb8, 0x78, 0xf8,
+	0x04, 0x84, 0x44, 0xc4, 0x24, 0xa4, 0x64, 0xe4,
+	0x14, 0x94, 0x54, 0xd4, 0x34, 0xb4, 0x74, 0xf4,
+	0x0c, 0x8c, 0x4c, 0xcc, 0x2c, 0xac, 0x6c, 0xec,
+	0x1c, 0x9c, 0x5c, 0xdc, 0x3c, 0xbc, 0x7c, 0xfc,
+	0x02, 0x82, 0x42, 0xc2, 0x22, 0xa2, 0x62, 0xe2,
+	0x12, 0x92, 0x52, 0xd2, 0x32, 0xb2, 0x72, 0xf2,
+	0x0a, 0x8a, 0x4a, 0xca, 0x2a, 0xaa, 0x6a, 0xea,
+	0x1a, 0x9a, 0x5a, 0xda, 0x3a, 0xba, 0x7a, 0xfa,
+	0x06, 0x86, 0x46, 0xc6, 0x26, 0xa6, 0x66, 0xe6,
+	0x16, 0x96, 0x56, 0xd6, 0x36, 0xb6, 0x76, 0xf6,
+	0x0e, 0x8e, 0x4e, 0xce, 0x2e, 0xae, 0x6e, 0xee,
+	0x1e, 0x9e, 0x5e, 0xde, 0x3e, 0xbe, 0x7e, 0xfe,
+	0x01, 0x81, 0x41, 0xc1, 0x21, 0xa1, 0x61, 0xe1,
+	0x11, 0x91, 0x51, 0xd1, 0x31, 0xb1, 0x71, 0xf1,
+	0x09, 0x89, 0x49, 0xc9, 0x29, 0xa9, 0x69, 0xe9,
+	0x19, 0x99, 0x59, 0xd9, 0x39, 0xb9, 0x79, 0xf9,
+	0x05, 0x85, 0x45, 0xc5, 0x25, 0xa5, 0x65, 0xe5,
+	0x15, 0x95, 0x55, 0xd5, 0x35, 0xb5, 0x75, 0xf5,
+	0x0d, 0x8d, 0x4d, 0xcd, 0x2d, 0xad, 0x6d, 0xed,
+	0x1d, 0x9d, 0x5d, 0xdd, 0x3d, 0xbd, 0x7d, 0xfd,
+	0x03, 0x83, 0x43, 0xc3, 0x23, 0xa3, 0x63, 0xe3,
+	0x13, 0x93, 0x53, 0xd3, 0x33, 0xb3, 0x73, 0xf3,
+	0x0b, 0x8b, 0x4b, 0xcb, 0x2b, 0xab, 0x6b, 0xeb,
+	0x1b, 0x9b, 0x5b, 0xdb, 0x3b, 0xbb, 0x7b, 0xfb,
+	0x07, 0x87, 0x47, 0xc7, 0x27, 0xa7, 0x67, 0xe7,
+	0x17, 0x97, 0x57, 0xd7, 0x37, 0xb7, 0x77, 0xf7,
+	0x0f, 0x8f, 0x4f, 0xcf, 0x2f, 0xaf, 0x6f, 0xef,
+	0x1f, 0x9f, 0x5f, 0xdf, 0x3f, 0xbf, 0x7f, 0xff,
+}
+
+func reverseUint16(v uint16) uint16 {
+	return uint16(reverseByte[v>>8]) | uint16(reverseByte[v&0xFF])<<8
+}
+
+func reverseBits(number uint16, bitLength byte) uint16 {
+	return reverseUint16(number << uint8(16-bitLength))
+}
--- a/vendor/github.com/klauspost/compress/flate/snappy.go
+++ b/vendor/github.com/klauspost/compress/flate/snappy.go
@@ -0,0 +1,900 @@
+// Copyright 2011 The Snappy-Go Authors. All rights reserved.
+// Modified for deflate by Klaus Post (c) 2015.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+package flate
+
+// emitLiteral writes a literal chunk and returns the number of bytes written.
+func emitLiteral(dst *tokens, lit []byte) {
+	ol := int(dst.n)
+	for i, v := range lit {
+		dst.tokens[(i+ol)&maxStoreBlockSize] = token(v)
+	}
+	dst.n += uint16(len(lit))
+}
+
+// emitCopy writes a copy chunk and returns the number of bytes written.
+func emitCopy(dst *tokens, offset, length int) {
+	dst.tokens[dst.n] = matchToken(uint32(length-3), uint32(offset-minOffsetSize))
+	dst.n++
+}
+
+type snappyEnc interface {
+	Encode(dst *tokens, src []byte)
+	Reset()
+}
+
+func newSnappy(level int) snappyEnc {
+	switch level {
+	case 1:
+		return &snappyL1{}
+	case 2:
+		return &snappyL2{snappyGen: snappyGen{cur: maxStoreBlockSize, prev: make([]byte, 0, maxStoreBlockSize)}}
+	case 3:
+		return &snappyL3{snappyGen: snappyGen{cur: maxStoreBlockSize, prev: make([]byte, 0, maxStoreBlockSize)}}
+	case 4:
+		return &snappyL4{snappyL3{snappyGen: snappyGen{cur: maxStoreBlockSize, prev: make([]byte, 0, maxStoreBlockSize)}}}
+	default:
+		panic("invalid level specified")
+	}
+}
+
+const (
+	tableBits       = 14             // Bits used in the table
+	tableSize       = 1 << tableBits // Size of the table
+	tableMask       = tableSize - 1  // Mask for table indices. Redundant, but can eliminate bounds checks.
+	tableShift      = 32 - tableBits // Right-shift to get the tableBits most significant bits of a uint32.
+	baseMatchOffset = 1              // The smallest match offset
+	baseMatchLength = 3              // The smallest match length per the RFC section 3.2.5
+	maxMatchOffset  = 1 << 15        // The largest match offset
+)
+
+func load32(b []byte, i int) uint32 {
+	b = b[i : i+4 : len(b)] // Help the compiler eliminate bounds checks on the next line.
+	return uint32(b[0]) | uint32(b[1])<<8 | uint32(b[2])<<16 | uint32(b[3])<<24
+}
+
+func load64(b []byte, i int) uint64 {
+	b = b[i : i+8 : len(b)] // Help the compiler eliminate bounds checks on the next line.
+	return uint64(b[0]) | uint64(b[1])<<8 | uint64(b[2])<<16 | uint64(b[3])<<24 |
+		uint64(b[4])<<32 | uint64(b[5])<<40 | uint64(b[6])<<48 | uint64(b[7])<<56
+}
+
+func hash(u uint32) uint32 {
+	return (u * 0x1e35a7bd) >> tableShift
+}
+
+// snappyL1 encapsulates level 1 compression
+type snappyL1 struct{}
+
+func (e *snappyL1) Reset() {}
+
+func (e *snappyL1) Encode(dst *tokens, src []byte) {
+	const (
+		inputMargin            = 16 - 1
+		minNonLiteralBlockSize = 1 + 1 + inputMargin
+	)
+
+	// This check isn't in the Snappy implementation, but there, the caller
+	// instead of the callee handles this case.
+	if len(src) < minNonLiteralBlockSize {
+		// We do not fill the token table.
+		// This will be picked up by caller.
+		dst.n = uint16(len(src))
+		return
+	}
+
+	// Initialize the hash table.
+	//
+	// The table element type is uint16, as s < sLimit and sLimit < len(src)
+	// and len(src) <= maxStoreBlockSize and maxStoreBlockSize == 65535.
+	var table [tableSize]uint16
+
+	// sLimit is when to stop looking for offset/length copies. The inputMargin
+	// lets us use a fast path for emitLiteral in the main loop, while we are
+	// looking for copies.
+	sLimit := len(src) - inputMargin
+
+	// nextEmit is where in src the next emitLiteral should start from.
+	nextEmit := 0
+
+	// The encoded form must start with a literal, as there are no previous
+	// bytes to copy, so we start looking for hash matches at s == 1.
+	s := 1
+	nextHash := hash(load32(src, s))
+
+	for {
+		// Copied from the C++ snappy implementation:
+		//
+		// Heuristic match skipping: If 32 bytes are scanned with no matches
+		// found, start looking only at every other byte. If 32 more bytes are
+		// scanned (or skipped), look at every third byte, etc.. When a match
+		// is found, immediately go back to looking at every byte. This is a
+		// small loss (~5% performance, ~0.1% density) for compressible data
+		// due to more bookkeeping, but for non-compressible data (such as
+		// JPEG) it's a huge win since the compressor quickly "realizes" the
+		// data is incompressible and doesn't bother looking for matches
+		// everywhere.
+		//
+		// The "skip" variable keeps track of how many bytes there are since
+		// the last match; dividing it by 32 (ie. right-shifting by five) gives
+		// the number of bytes to move ahead for each iteration.
+		skip := 32
+
+		nextS := s
+		candidate := 0
+		for {
+			s = nextS
+			bytesBetweenHashLookups := skip >> 5
+			nextS = s + bytesBetweenHashLookups
+			skip += bytesBetweenHashLookups
+			if nextS > sLimit {
+				goto emitRemainder
+			}
+			candidate = int(table[nextHash&tableMask])
+			table[nextHash&tableMask] = uint16(s)
+			nextHash = hash(load32(src, nextS))
+			if s-candidate <= maxMatchOffset && load32(src, s) == load32(src, candidate) {
+				break
+			}
+		}
+
+		// A 4-byte match has been found. We'll later see if more than 4 bytes
+		// match. But, prior to the match, src[nextEmit:s] are unmatched. Emit
+		// them as literal bytes.
+		emitLiteral(dst, src[nextEmit:s])
+
+		// Call emitCopy, and then see if another emitCopy could be our next
+		// move. Repeat until we find no match for the input immediately after
+		// what was consumed by the last emitCopy call.
+		//
+		// If we exit this loop normally then we need to call emitLiteral next,
+		// though we don't yet know how big the literal will be. We handle that
+		// by proceeding to the next iteration of the main loop. We also can
+		// exit this loop via goto if we get close to exhausting the input.
+		for {
+			// Invariant: we have a 4-byte match at s, and no need to emit any
+			// literal bytes prior to s.
+			base := s
+
+			// Extend the 4-byte match as long as possible.
+			//
+			// This is an inlined version of Snappy's:
+			//	s = extendMatch(src, candidate+4, s+4)
+			s += 4
+			s1 := base + maxMatchLength
+			if s1 > len(src) {
+				s1 = len(src)
+			}
+			a := src[s:s1]
+			b := src[candidate+4:]
+			b = b[:len(a)]
+			l := len(a)
+			for i := range a {
+				if a[i] != b[i] {
+					l = i
+					break
+				}
+			}
+			s += l
+
+			// matchToken is flate's equivalent of Snappy's emitCopy.
+			dst.tokens[dst.n] = matchToken(uint32(s-base-baseMatchLength), uint32(base-candidate-baseMatchOffset))
+			dst.n++
+			nextEmit = s
+			if s >= sLimit {
+				goto emitRemainder
+			}
+
+			// We could immediately start working at s now, but to improve
+			// compression we first update the hash table at s-1 and at s. If
+			// another emitCopy is not our next move, also calculate nextHash
+			// at s+1. At least on GOARCH=amd64, these three hash calculations
+			// are faster as one load64 call (with some shifts) instead of
+			// three load32 calls.
+			x := load64(src, s-1)
+			prevHash := hash(uint32(x >> 0))
+			table[prevHash&tableMask] = uint16(s - 1)
+			currHash := hash(uint32(x >> 8))
+			candidate = int(table[currHash&tableMask])
+			table[currHash&tableMask] = uint16(s)
+			if s-candidate > maxMatchOffset || uint32(x>>8) != load32(src, candidate) {
+				nextHash = hash(uint32(x >> 16))
+				s++
+				break
+			}
+		}
+	}
+
+emitRemainder:
+	if nextEmit < len(src) {
+		emitLiteral(dst, src[nextEmit:])
+	}
+}
+
+type tableEntry struct {
+	val    uint32
+	offset int32
+}
+
+func load3232(b []byte, i int32) uint32 {
+	b = b[i : i+4 : len(b)] // Help the compiler eliminate bounds checks on the next line.
+	return uint32(b[0]) | uint32(b[1])<<8 | uint32(b[2])<<16 | uint32(b[3])<<24
+}
+
+func load6432(b []byte, i int32) uint64 {
+	b = b[i : i+8 : len(b)] // Help the compiler eliminate bounds checks on the next line.
+	return uint64(b[0]) | uint64(b[1])<<8 | uint64(b[2])<<16 | uint64(b[3])<<24 |
+		uint64(b[4])<<32 | uint64(b[5])<<40 | uint64(b[6])<<48 | uint64(b[7])<<56
+}
+
+// snappyGen maintains the table for matches,
+// and the previous byte block for level 2.
+// This is the generic implementation.
+type snappyGen struct {
+	prev []byte
+	cur  int32
+}
+
+// snappyGen maintains the table for matches,
+// and the previous byte block for level 2.
+// This is the generic implementation.
+type snappyL2 struct {
+	snappyGen
+	table [tableSize]tableEntry
+}
+
+// EncodeL2 uses a similar algorithm to level 1, but is capable
+// of matching across blocks giving better compression at a small slowdown.
+func (e *snappyL2) Encode(dst *tokens, src []byte) {
+	const (
+		inputMargin            = 8 - 1
+		minNonLiteralBlockSize = 1 + 1 + inputMargin
+	)
+
+	// Protect against e.cur wraparound.
+	if e.cur > 1<<30 {
+		for i := range e.table[:] {
+			e.table[i] = tableEntry{}
+		}
+		e.cur = maxStoreBlockSize
+	}
+
+	// This check isn't in the Snappy implementation, but there, the caller
+	// instead of the callee handles this case.
+	if len(src) < minNonLiteralBlockSize {
+		// We do not fill the token table.
+		// This will be picked up by caller.
+		dst.n = uint16(len(src))
+		e.cur += maxStoreBlockSize
+		e.prev = e.prev[:0]
+		return
+	}
+
+	// sLimit is when to stop looking for offset/length copies. The inputMargin
+	// lets us use a fast path for emitLiteral in the main loop, while we are
+	// looking for copies.
+	sLimit := int32(len(src) - inputMargin)
+
+	// nextEmit is where in src the next emitLiteral should start from.
+	nextEmit := int32(0)
+	s := int32(0)
+	cv := load3232(src, s)
+	nextHash := hash(cv)
+
+	for {
+		// Copied from the C++ snappy implementation:
+		//
+		// Heuristic match skipping: If 32 bytes are scanned with no matches
+		// found, start looking only at every other byte. If 32 more bytes are
+		// scanned (or skipped), look at every third byte, etc.. When a match
+		// is found, immediately go back to looking at every byte. This is a
+		// small loss (~5% performance, ~0.1% density) for compressible data
+		// due to more bookkeeping, but for non-compressible data (such as
+		// JPEG) it's a huge win since the compressor quickly "realizes" the
+		// data is incompressible and doesn't bother looking for matches
+		// everywhere.
+		//
+		// The "skip" variable keeps track of how many bytes there are since
+		// the last match; dividing it by 32 (ie. right-shifting by five) gives
+		// the number of bytes to move ahead for each iteration.
+		skip := int32(32)
+
+		nextS := s
+		var candidate tableEntry
+		for {
+			s = nextS
+			bytesBetweenHashLookups := skip >> 5
+			nextS = s + bytesBetweenHashLookups
+			skip += bytesBetweenHashLookups
+			if nextS > sLimit {
+				goto emitRemainder
+			}
+			candidate = e.table[nextHash&tableMask]
+			now := load3232(src, nextS)
+			e.table[nextHash&tableMask] = tableEntry{offset: s + e.cur, val: cv}
+			nextHash = hash(now)
+
+			offset := s - (candidate.offset - e.cur)
+			if offset > maxMatchOffset || cv != candidate.val {
+				// Out of range or not matched.
+				cv = now
+				continue
+			}
+			break
+		}
+
+		// A 4-byte match has been found. We'll later see if more than 4 bytes
+		// match. But, prior to the match, src[nextEmit:s] are unmatched. Emit
+		// them as literal bytes.
+		emitLiteral(dst, src[nextEmit:s])
+
+		// Call emitCopy, and then see if another emitCopy could be our next
+		// move. Repeat until we find no match for the input immediately after
+		// what was consumed by the last emitCopy call.
+		//
+		// If we exit this loop normally then we need to call emitLiteral next,
+		// though we don't yet know how big the literal will be. We handle that
+		// by proceeding to the next iteration of the main loop. We also can
+		// exit this loop via goto if we get close to exhausting the input.
+		for {
+			// Invariant: we have a 4-byte match at s, and no need to emit any
+			// literal bytes prior to s.
+
+			// Extend the 4-byte match as long as possible.
+			//
+			s += 4
+			t := candidate.offset - e.cur + 4
+			l := e.matchlen(s, t, src)
+
+			// matchToken is flate's equivalent of Snappy's emitCopy. (length,offset)
+			dst.tokens[dst.n] = matchToken(uint32(l+4-baseMatchLength), uint32(s-t-baseMatchOffset))
+			dst.n++
+			s += l
+			nextEmit = s
+			if s >= sLimit {
+				t += l
+				// Index first pair after match end.
+				if int(t+4) < len(src) && t > 0 {
+					cv := load3232(src, t)
+					e.table[hash(cv)&tableMask] = tableEntry{offset: t + e.cur, val: cv}
+				}
+				goto emitRemainder
+			}
+
+			// We could immediately start working at s now, but to improve
+			// compression we first update the hash table at s-1 and at s. If
+			// another emitCopy is not our next move, also calculate nextHash
+			// at s+1. At least on GOARCH=amd64, these three hash calculations
+			// are faster as one load64 call (with some shifts) instead of
+			// three load32 calls.
+			x := load6432(src, s-1)
+			prevHash := hash(uint32(x))
+			e.table[prevHash&tableMask] = tableEntry{offset: e.cur + s - 1, val: uint32(x)}
+			x >>= 8
+			currHash := hash(uint32(x))
+			candidate = e.table[currHash&tableMask]
+			e.table[currHash&tableMask] = tableEntry{offset: e.cur + s, val: uint32(x)}
+
+			offset := s - (candidate.offset - e.cur)
+			if offset > maxMatchOffset || uint32(x) != candidate.val {
+				cv = uint32(x >> 8)
+				nextHash = hash(cv)
+				s++
+				break
+			}
+		}
+	}
+
+emitRemainder:
+	if int(nextEmit) < len(src) {
+		emitLiteral(dst, src[nextEmit:])
+	}
+	e.cur += int32(len(src))
+	e.prev = e.prev[:len(src)]
+	copy(e.prev, src)
+}
+
+type tableEntryPrev struct {
+	Cur  tableEntry
+	Prev tableEntry
+}
+
+// snappyL3
+type snappyL3 struct {
+	snappyGen
+	table [tableSize]tableEntryPrev
+}
+
+// Encode uses a similar algorithm to level 2, will check up to two candidates.
+func (e *snappyL3) Encode(dst *tokens, src []byte) {
+	const (
+		inputMargin            = 8 - 1
+		minNonLiteralBlockSize = 1 + 1 + inputMargin
+	)
+
+	// Protect against e.cur wraparound.
+	if e.cur > 1<<30 {
+		for i := range e.table[:] {
+			e.table[i] = tableEntryPrev{}
+		}
+		e.snappyGen = snappyGen{cur: maxStoreBlockSize, prev: e.prev[:0]}
+	}
+
+	// This check isn't in the Snappy implementation, but there, the caller
+	// instead of the callee handles this case.
+	if len(src) < minNonLiteralBlockSize {
+		// We do not fill the token table.
+		// This will be picked up by caller.
+		dst.n = uint16(len(src))
+		e.cur += maxStoreBlockSize
+		e.prev = e.prev[:0]
+		return
+	}
+
+	// sLimit is when to stop looking for offset/length copies. The inputMargin
+	// lets us use a fast path for emitLiteral in the main loop, while we are
+	// looking for copies.
+	sLimit := int32(len(src) - inputMargin)
+
+	// nextEmit is where in src the next emitLiteral should start from.
+	nextEmit := int32(0)
+	s := int32(0)
+	cv := load3232(src, s)
+	nextHash := hash(cv)
+
+	for {
+		// Copied from the C++ snappy implementation:
+		//
+		// Heuristic match skipping: If 32 bytes are scanned with no matches
+		// found, start looking only at every other byte. If 32 more bytes are
+		// scanned (or skipped), look at every third byte, etc.. When a match
+		// is found, immediately go back to looking at every byte. This is a
+		// small loss (~5% performance, ~0.1% density) for compressible data
+		// due to more bookkeeping, but for non-compressible data (such as
+		// JPEG) it's a huge win since the compressor quickly "realizes" the
+		// data is incompressible and doesn't bother looking for matches
+		// everywhere.
+		//
+		// The "skip" variable keeps track of how many bytes there are since
+		// the last match; dividing it by 32 (ie. right-shifting by five) gives
+		// the number of bytes to move ahead for each iteration.
+		skip := int32(32)
+
+		nextS := s
+		var candidate tableEntry
+		for {
+			s = nextS
+			bytesBetweenHashLookups := skip >> 5
+			nextS = s + bytesBetweenHashLookups
+			skip += bytesBetweenHashLookups
+			if nextS > sLimit {
+				goto emitRemainder
+			}
+			candidates := e.table[nextHash&tableMask]
+			now := load3232(src, nextS)
+			e.table[nextHash&tableMask] = tableEntryPrev{Prev: candidates.Cur, Cur: tableEntry{offset: s + e.cur, val: cv}}
+			nextHash = hash(now)
+
+			// Check both candidates
+			candidate = candidates.Cur
+			if cv == candidate.val {
+				offset := s - (candidate.offset - e.cur)
+				if offset <= maxMatchOffset {
+					break
+				}
+			} else {
+				// We only check if value mismatches.
+				// Offset will always be invalid in other cases.
+				candidate = candidates.Prev
+				if cv == candidate.val {
+					offset := s - (candidate.offset - e.cur)
+					if offset <= maxMatchOffset {
+						break
+					}
+				}
+			}
+			cv = now
+		}
+
+		// A 4-byte match has been found. We'll later see if more than 4 bytes
+		// match. But, prior to the match, src[nextEmit:s] are unmatched. Emit
+		// them as literal bytes.
+		emitLiteral(dst, src[nextEmit:s])
+
+		// Call emitCopy, and then see if another emitCopy could be our next
+		// move. Repeat until we find no match for the input immediately after
+		// what was consumed by the last emitCopy call.
+		//
+		// If we exit this loop normally then we need to call emitLiteral next,
+		// though we don't yet know how big the literal will be. We handle that
+		// by proceeding to the next iteration of the main loop. We also can
+		// exit this loop via goto if we get close to exhausting the input.
+		for {
+			// Invariant: we have a 4-byte match at s, and no need to emit any
+			// literal bytes prior to s.
+
+			// Extend the 4-byte match as long as possible.
+			//
+			s += 4
+			t := candidate.offset - e.cur + 4
+			l := e.matchlen(s, t, src)
+
+			// matchToken is flate's equivalent of Snappy's emitCopy. (length,offset)
+			dst.tokens[dst.n] = matchToken(uint32(l+4-baseMatchLength), uint32(s-t-baseMatchOffset))
+			dst.n++
+			s += l
+			nextEmit = s
+			if s >= sLimit {
+				t += l
+				// Index first pair after match end.
+				if int(t+4) < len(src) && t > 0 {
+					cv := load3232(src, t)
+					nextHash = hash(cv)
+					e.table[nextHash&tableMask] = tableEntryPrev{
+						Prev: e.table[nextHash&tableMask].Cur,
+						Cur:  tableEntry{offset: e.cur + t, val: cv},
+					}
+				}
+				goto emitRemainder
+			}
+
+			// We could immediately start working at s now, but to improve
+			// compression we first update the hash table at s-3 to s. If
+			// another emitCopy is not our next move, also calculate nextHash
+			// at s+1. At least on GOARCH=amd64, these three hash calculations
+			// are faster as one load64 call (with some shifts) instead of
+			// three load32 calls.
+			x := load6432(src, s-3)
+			prevHash := hash(uint32(x))
+			e.table[prevHash&tableMask] = tableEntryPrev{
+				Prev: e.table[prevHash&tableMask].Cur,
+				Cur:  tableEntry{offset: e.cur + s - 3, val: uint32(x)},
+			}
+			x >>= 8
+			prevHash = hash(uint32(x))
+
+			e.table[prevHash&tableMask] = tableEntryPrev{
+				Prev: e.table[prevHash&tableMask].Cur,
+				Cur:  tableEntry{offset: e.cur + s - 2, val: uint32(x)},
+			}
+			x >>= 8
+			prevHash = hash(uint32(x))
+
+			e.table[prevHash&tableMask] = tableEntryPrev{
+				Prev: e.table[prevHash&tableMask].Cur,
+				Cur:  tableEntry{offset: e.cur + s - 1, val: uint32(x)},
+			}
+			x >>= 8
+			currHash := hash(uint32(x))
+			candidates := e.table[currHash&tableMask]
+			cv = uint32(x)
+			e.table[currHash&tableMask] = tableEntryPrev{
+				Prev: candidates.Cur,
+				Cur:  tableEntry{offset: s + e.cur, val: cv},
+			}
+
+			// Check both candidates
+			candidate = candidates.Cur
+			if cv == candidate.val {
+				offset := s - (candidate.offset - e.cur)
+				if offset <= maxMatchOffset {
+					continue
+				}
+			} else {
+				// We only check if value mismatches.
+				// Offset will always be invalid in other cases.
+				candidate = candidates.Prev
+				if cv == candidate.val {
+					offset := s - (candidate.offset - e.cur)
+					if offset <= maxMatchOffset {
+						continue
+					}
+				}
+			}
+			cv = uint32(x >> 8)
+			nextHash = hash(cv)
+			s++
+			break
+		}
+	}
+
+emitRemainder:
+	if int(nextEmit) < len(src) {
+		emitLiteral(dst, src[nextEmit:])
+	}
+	e.cur += int32(len(src))
+	e.prev = e.prev[:len(src)]
+	copy(e.prev, src)
+}
+
+// snappyL4
+type snappyL4 struct {
+	snappyL3
+}
+
+// Encode uses a similar algorithm to level 3,
+// but will check up to two candidates if first isn't long enough.
+func (e *snappyL4) Encode(dst *tokens, src []byte) {
+	const (
+		inputMargin            = 8 - 3
+		minNonLiteralBlockSize = 1 + 1 + inputMargin
+		matchLenGood           = 12
+	)
+
+	// Protect against e.cur wraparound.
+	if e.cur > 1<<30 {
+		for i := range e.table[:] {
+			e.table[i] = tableEntryPrev{}
+		}
+		e.snappyGen = snappyGen{cur: maxStoreBlockSize, prev: e.prev[:0]}
+	}
+
+	// This check isn't in the Snappy implementation, but there, the caller
+	// instead of the callee handles this case.
+	if len(src) < minNonLiteralBlockSize {
+		// We do not fill the token table.
+		// This will be picked up by caller.
+		dst.n = uint16(len(src))
+		e.cur += maxStoreBlockSize
+		e.prev = e.prev[:0]
+		return
+	}
+
+	// sLimit is when to stop looking for offset/length copies. The inputMargin
+	// lets us use a fast path for emitLiteral in the main loop, while we are
+	// looking for copies.
+	sLimit := int32(len(src) - inputMargin)
+
+	// nextEmit is where in src the next emitLiteral should start from.
+	nextEmit := int32(0)
+	s := int32(0)
+	cv := load3232(src, s)
+	nextHash := hash(cv)
+
+	for {
+		// Copied from the C++ snappy implementation:
+		//
+		// Heuristic match skipping: If 32 bytes are scanned with no matches
+		// found, start looking only at every other byte. If 32 more bytes are
+		// scanned (or skipped), look at every third byte, etc.. When a match
+		// is found, immediately go back to looking at every byte. This is a
+		// small loss (~5% performance, ~0.1% density) for compressible data
+		// due to more bookkeeping, but for non-compressible data (such as
+		// JPEG) it's a huge win since the compressor quickly "realizes" the
+		// data is incompressible and doesn't bother looking for matches
+		// everywhere.
+		//
+		// The "skip" variable keeps track of how many bytes there are since
+		// the last match; dividing it by 32 (ie. right-shifting by five) gives
+		// the number of bytes to move ahead for each iteration.
+		skip := int32(32)
+
+		nextS := s
+		var candidate tableEntry
+		var candidateAlt tableEntry
+		for {
+			s = nextS
+			bytesBetweenHashLookups := skip >> 5
+			nextS = s + bytesBetweenHashLookups
+			skip += bytesBetweenHashLookups
+			if nextS > sLimit {
+				goto emitRemainder
+			}
+			candidates := e.table[nextHash&tableMask]
+			now := load3232(src, nextS)
+			e.table[nextHash&tableMask] = tableEntryPrev{Prev: candidates.Cur, Cur: tableEntry{offset: s + e.cur, val: cv}}
+			nextHash = hash(now)
+
+			// Check both candidates
+			candidate = candidates.Cur
+			if cv == candidate.val {
+				offset := s - (candidate.offset - e.cur)
+				if offset < maxMatchOffset {
+					offset = s - (candidates.Prev.offset - e.cur)
+					if cv == candidates.Prev.val && offset < maxMatchOffset {
+						candidateAlt = candidates.Prev
+					}
+					break
+				}
+			} else {
+				// We only check if value mismatches.
+				// Offset will always be invalid in other cases.
+				candidate = candidates.Prev
+				if cv == candidate.val {
+					offset := s - (candidate.offset - e.cur)
+					if offset < maxMatchOffset {
+						break
+					}
+				}
+			}
+			cv = now
+		}
+
+		// A 4-byte match has been found. We'll later see if more than 4 bytes
+		// match. But, prior to the match, src[nextEmit:s] are unmatched. Emit
+		// them as literal bytes.
+		emitLiteral(dst, src[nextEmit:s])
+
+		// Call emitCopy, and then see if another emitCopy could be our next
+		// move. Repeat until we find no match for the input immediately after
+		// what was consumed by the last emitCopy call.
+		//
+		// If we exit this loop normally then we need to call emitLiteral next,
+		// though we don't yet know how big the literal will be. We handle that
+		// by proceeding to the next iteration of the main loop. We also can
+		// exit this loop via goto if we get close to exhausting the input.
+		for {
+			// Invariant: we have a 4-byte match at s, and no need to emit any
+			// literal bytes prior to s.
+
+			// Extend the 4-byte match as long as possible.
+			//
+			s += 4
+			t := candidate.offset - e.cur + 4
+			l := e.matchlen(s, t, src)
+			// Try alternative candidate if match length < matchLenGood.
+			if l < matchLenGood-4 && candidateAlt.offset != 0 {
+				t2 := candidateAlt.offset - e.cur + 4
+				l2 := e.matchlen(s, t2, src)
+				if l2 > l {
+					l = l2
+					t = t2
+				}
+			}
+			// matchToken is flate's equivalent of Snappy's emitCopy. (length,offset)
+			dst.tokens[dst.n] = matchToken(uint32(l+4-baseMatchLength), uint32(s-t-baseMatchOffset))
+			dst.n++
+			s += l
+			nextEmit = s
+			if s >= sLimit {
+				t += l
+				// Index first pair after match end.
+				if int(t+4) < len(src) && t > 0 {
+					cv := load3232(src, t)
+					nextHash = hash(cv)
+					e.table[nextHash&tableMask] = tableEntryPrev{
+						Prev: e.table[nextHash&tableMask].Cur,
+						Cur:  tableEntry{offset: e.cur + t, val: cv},
+					}
+				}
+				goto emitRemainder
+			}
+
+			// We could immediately start working at s now, but to improve
+			// compression we first update the hash table at s-3 to s. If
+			// another emitCopy is not our next move, also calculate nextHash
+			// at s+1. At least on GOARCH=amd64, these three hash calculations
+			// are faster as one load64 call (with some shifts) instead of
+			// three load32 calls.
+			x := load6432(src, s-3)
+			prevHash := hash(uint32(x))
+			e.table[prevHash&tableMask] = tableEntryPrev{
+				Prev: e.table[prevHash&tableMask].Cur,
+				Cur:  tableEntry{offset: e.cur + s - 3, val: uint32(x)},
+			}
+			x >>= 8
+			prevHash = hash(uint32(x))
+
+			e.table[prevHash&tableMask] = tableEntryPrev{
+				Prev: e.table[prevHash&tableMask].Cur,
+				Cur:  tableEntry{offset: e.cur + s - 2, val: uint32(x)},
+			}
+			x >>= 8
+			prevHash = hash(uint32(x))
+
+			e.table[prevHash&tableMask] = tableEntryPrev{
+				Prev: e.table[prevHash&tableMask].Cur,
+				Cur:  tableEntry{offset: e.cur + s - 1, val: uint32(x)},
+			}
+			x >>= 8
+			currHash := hash(uint32(x))
+			candidates := e.table[currHash&tableMask]
+			cv = uint32(x)
+			e.table[currHash&tableMask] = tableEntryPrev{
+				Prev: candidates.Cur,
+				Cur:  tableEntry{offset: s + e.cur, val: cv},
+			}
+
+			// Check both candidates
+			candidate = candidates.Cur
+			candidateAlt = tableEntry{}
+			if cv == candidate.val {
+				offset := s - (candidate.offset - e.cur)
+				if offset <= maxMatchOffset {
+					offset = s - (candidates.Prev.offset - e.cur)
+					if cv == candidates.Prev.val && offset <= maxMatchOffset {
+						candidateAlt = candidates.Prev
+					}
+					continue
+				}
+			} else {
+				// We only check if value mismatches.
+				// Offset will always be invalid in other cases.
+				candidate = candidates.Prev
+				if cv == candidate.val {
+					offset := s - (candidate.offset - e.cur)
+					if offset <= maxMatchOffset {
+						continue
+					}
+				}
+			}
+			cv = uint32(x >> 8)
+			nextHash = hash(cv)
+			s++
+			break
+		}
+	}
+
+emitRemainder:
+	if int(nextEmit) < len(src) {
+		emitLiteral(dst, src[nextEmit:])
+	}
+	e.cur += int32(len(src))
+	e.prev = e.prev[:len(src)]
+	copy(e.prev, src)
+}
+
+func (e *snappyGen) matchlen(s, t int32, src []byte) int32 {
+	s1 := int(s) + maxMatchLength - 4
+	if s1 > len(src) {
+		s1 = len(src)
+	}
+
+	// If we are inside the current block
+	if t >= 0 {
+		b := src[t:]
+		a := src[s:s1]
+		b = b[:len(a)]
+		// Extend the match to be as long as possible.
+		for i := range a {
+			if a[i] != b[i] {
+				return int32(i)
+			}
+		}
+		return int32(len(a))
+	}
+
+	// We found a match in the previous block.
+	tp := int32(len(e.prev)) + t
+	if tp < 0 {
+		return 0
+	}
+
+	// Extend the match to be as long as possible.
+	a := src[s:s1]
+	b := e.prev[tp:]
+	if len(b) > len(a) {
+		b = b[:len(a)]
+	}
+	a = a[:len(b)]
+	for i := range b {
+		if a[i] != b[i] {
+			return int32(i)
+		}
+	}
+
+	// If we reached our limit, we matched everything we are
+	// allowed to in the previous block and we return.
+	n := int32(len(b))
+	if int(s+n) == s1 {
+		return n
+	}
+
+	// Continue looking for more matches in the current block.
+	a = src[s+n : s1]
+	b = src[:len(a)]
+	for i := range a {
+		if a[i] != b[i] {
+			return int32(i) + n
+		}
+	}
+	return int32(len(a)) + n
+}
+
+// Reset the encoding table.
+func (e *snappyGen) Reset() {
+	e.prev = e.prev[:0]
+	e.cur += maxMatchOffset
+}
--- a/vendor/github.com/klauspost/compress/flate/testdata/huffman-null-max.dyn.expect
+++ b/vendor/github.com/klauspost/compress/flate/testdata/huffman-null-max.dyn.expect
--- a/vendor/github.com/klauspost/compress/flate/testdata/huffman-null-max.dyn.expect-noinput
+++ b/vendor/github.com/klauspost/compress/flate/testdata/huffman-null-max.dyn.expect-noinput
--- a/vendor/github.com/klauspost/compress/flate/testdata/huffman-null-max.golden
+++ b/vendor/github.com/klauspost/compress/flate/testdata/huffman-null-max.golden
--- a/vendor/github.com/klauspost/compress/flate/testdata/huffman-null-max.in
+++ b/vendor/github.com/klauspost/compress/flate/testdata/huffman-null-max.in
--- a/vendor/github.com/klauspost/compress/flate/testdata/huffman-null-max.wb.expect
+++ b/vendor/github.com/klauspost/compress/flate/testdata/huffman-null-max.wb.expect
--- a/vendor/github.com/klauspost/compress/flate/testdata/huffman-null-max.wb.expect-noinput
+++ b/vendor/github.com/klauspost/compress/flate/testdata/huffman-null-max.wb.expect-noinput
--- a/vendor/github.com/klauspost/compress/flate/testdata/huffman-pi.dyn.expect
+++ b/vendor/github.com/klauspost/compress/flate/testdata/huffman-pi.dyn.expect
--- a/vendor/github.com/klauspost/compress/flate/testdata/huffman-pi.dyn.expect-noinput
+++ b/vendor/github.com/klauspost/compress/flate/testdata/huffman-pi.dyn.expect-noinput
--- a/vendor/github.com/klauspost/compress/flate/testdata/huffman-pi.golden
+++ b/vendor/github.com/klauspost/compress/flate/testdata/huffman-pi.golden
--- a/vendor/github.com/klauspost/compress/flate/testdata/huffman-pi.in
+++ b/vendor/github.com/klauspost/compress/flate/testdata/huffman-pi.in
@@ -0,0 +1 @@
+3.141592653589793238462643383279502884197169399375105820974944592307816406286208998628034825342117067982148086513282306647093844609550582231725359408128481117450284102701938521105559644622948954930381964428810975665933446128475648233786783165271201909145648566923460348610454326648213393607260249141273724587006606315588174881520920962829254091715364367892590360011330530548820466521384146951941511609433057270365759591953092186117381932611793105118548074462379962749567351885752724891227938183011949129833673362440656643086021394946395224737190702179860943702770539217176293176752384674818467669405132000568127145263560827785771342757789609173637178721468440901224953430146549585371050792279689258923542019956112129021960864034418159813629774771309960518707211349999998372978049951059731732816096318595024459455346908302642522308253344685035261931188171010003137838752886587533208381420617177669147303598253490428755468731159562863882353787593751957781857780532171226806613001927876611195909216420198938095257201065485863278865936153381827968230301952035301852968995773622599413891249721775283479131515574857242454150695950829533116861727855889075098381754637464939319255060400927701671139009848824012858361603563707660104710181942955596198946767837449448255379774726847104047534646208046684259069491293313677028989152104752162056966024058038150193511253382430035587640247496473263914199272604269922796782354781636009341721641219924586315030286182974555706749838505494588586926995690927210797509302955321165344987202755960236480665499119881834797753566369807426542527862551818417574672890977772793800081647060016145249192173217214772350141441973568548161361157352552133475741849468438523323907394143334547762416862518983569485562099219222184272550254256887671790494601653466804988627232791786085784383827967976681454100953883786360950680064225125205117392984896084128488626945604241965285022210661186306744278622039194945047123713786960956364371917287467764657573962413890865832645995813390478027590099465764078951269468398352595709825822620522489407726719478268482601476990902640136394437455305068203496252451749399651431429809190659250937221696461515709858387410597885959772975498930161753928468138268683868942774155991855925245953959431049972524680845987273644695848653836736222626099124608051243884390451244136549762780797715691435997700129616089441694868555848406353422072225828488648158456028506016842739452267467678895252138522549954666727823986456596116354886230577456498035593634568174324112515076069479451096596094025228879710893145669136867228748940560101503308617928680920874760917824938589009714909675985261365549781893129784821682998948722658804857564014270477555132379641451523746234364542858444795265867821051141354735739523113427166102135969536231442952484937187110145765403590279934403742007310578539062198387447808478489683321445713868751943506430218453191048481005370614680674919278191197939952061419663428754440643745123718192179998391015919561814675142691239748940907186494231961567945208095146550225231603881930142093762137855956638937787083039069792077346722182562599661501421503068038447734549202605414665925201497442850732518666002132434088190710486331734649651453905796268561005508106658796998163574736384052571459102897064140110971206280439039759515677157700420337869936007230558763176359421873125147120532928191826186125867321579198414848829164470609575270695722091756711672291098169091528017350671274858322287183520935396572512108357915136988209144421006751033467110314126711136990865851639831501970165151168517143765761835155650884909989859982387345528331635507647918535893226185489632132933089857064204675259070915481416549859461637180
--- a/vendor/github.com/klauspost/compress/flate/testdata/huffman-pi.wb.expect
+++ b/vendor/github.com/klauspost/compress/flate/testdata/huffman-pi.wb.expect
--- a/vendor/github.com/klauspost/compress/flate/testdata/huffman-pi.wb.expect-noinput
+++ b/vendor/github.com/klauspost/compress/flate/testdata/huffman-pi.wb.expect-noinput
--- a/vendor/github.com/klauspost/compress/flate/testdata/huffman-rand-1k.dyn.expect
+++ b/vendor/github.com/klauspost/compress/flate/testdata/huffman-rand-1k.dyn.expect
--- a/vendor/github.com/klauspost/compress/flate/testdata/huffman-rand-1k.dyn.expect-noinput
+++ b/vendor/github.com/klauspost/compress/flate/testdata/huffman-rand-1k.dyn.expect-noinput
--- a/vendor/github.com/klauspost/compress/flate/testdata/huffman-rand-1k.golden
+++ b/vendor/github.com/klauspost/compress/flate/testdata/huffman-rand-1k.golden
--- a/vendor/github.com/klauspost/compress/flate/testdata/huffman-rand-1k.in
+++ b/vendor/github.com/klauspost/compress/flate/testdata/huffman-rand-1k.in
--- a/vendor/github.com/klauspost/compress/flate/testdata/huffman-rand-1k.wb.expect
+++ b/vendor/github.com/klauspost/compress/flate/testdata/huffman-rand-1k.wb.expect
--- a/vendor/github.com/klauspost/compress/flate/testdata/huffman-rand-1k.wb.expect-noinput
+++ b/vendor/github.com/klauspost/compress/flate/testdata/huffman-rand-1k.wb.expect-noinput
--- a/vendor/github.com/klauspost/compress/flate/testdata/huffman-rand-limit.dyn.expect
+++ b/vendor/github.com/klauspost/compress/flate/testdata/huffman-rand-limit.dyn.expect
--- a/vendor/github.com/klauspost/compress/flate/testdata/huffman-rand-limit.dyn.expect-noinput
+++ b/vendor/github.com/klauspost/compress/flate/testdata/huffman-rand-limit.dyn.expect-noinput
--- a/vendor/github.com/klauspost/compress/flate/testdata/huffman-rand-limit.golden
+++ b/vendor/github.com/klauspost/compress/flate/testdata/huffman-rand-limit.golden
--- a/vendor/github.com/klauspost/compress/flate/testdata/huffman-rand-limit.in
+++ b/vendor/github.com/klauspost/compress/flate/testdata/huffman-rand-limit.in
@@ -0,0 +1,4 @@
+aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+<EFBFBD><EFBFBD><EFBFBD>vH
+<EFBFBD><EFBFBD>%<25><><EFBFBD><EFBFBD><EFBFBD><EFBFBD> <20><17>ɷ<EFBFBD><C9B7><06>}<18><>><07><><EFBFBD>ls<6C><73><EFBFBD>m<EFBFBD>IGH<15><><EFBFBD><EFBFBD>1Y<31>4<EFBFBD>[<5B><>	0[|]o#<23>
+<EFBFBD>-#<23><><EFBFBD>ul<><6C><EFBFBD>pf<70><66>ٱ<EFBFBD>n<EFBFBD>Y<EFBFBD>ԀY<D480>w<EFBFBD>C8ɯ02<30> F=gn<67>r<EFBFBD>N!O<><4F><EFBFBD>{<04><><03><05>k<EFBFBD>*<2A>w(<28><>b<EFBFBD> <20><1F>kQC9/<2F><>lu><3E>5<EFBFBD>C.<2E><>u<EFBFBD><75><EFBFBD>
--- a/vendor/github.com/klauspost/compress/flate/testdata/huffman-rand-limit.wb.expect
+++ b/vendor/github.com/klauspost/compress/flate/testdata/huffman-rand-limit.wb.expect
--- a/vendor/github.com/klauspost/compress/flate/testdata/huffman-rand-limit.wb.expect-noinput
+++ b/vendor/github.com/klauspost/compress/flate/testdata/huffman-rand-limit.wb.expect-noinput
--- a/vendor/github.com/klauspost/compress/flate/testdata/huffman-rand-max.golden
+++ b/vendor/github.com/klauspost/compress/flate/testdata/huffman-rand-max.golden
--- a/vendor/github.com/klauspost/compress/flate/testdata/huffman-rand-max.in
+++ b/vendor/github.com/klauspost/compress/flate/testdata/huffman-rand-max.in
--- a/vendor/github.com/klauspost/compress/flate/testdata/huffman-shifts.dyn.expect
+++ b/vendor/github.com/klauspost/compress/flate/testdata/huffman-shifts.dyn.expect
--- a/vendor/github.com/klauspost/compress/flate/testdata/huffman-shifts.dyn.expect-noinput
+++ b/vendor/github.com/klauspost/compress/flate/testdata/huffman-shifts.dyn.expect-noinput
--- a/vendor/github.com/klauspost/compress/flate/testdata/huffman-shifts.golden
+++ b/vendor/github.com/klauspost/compress/flate/testdata/huffman-shifts.golden
--- a/vendor/github.com/klauspost/compress/flate/testdata/huffman-shifts.in
+++ b/vendor/github.com/klauspost/compress/flate/testdata/huffman-shifts.in
@@ -0,0 +1,2 @@
+101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010
+232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323232323
--- a/vendor/github.com/klauspost/compress/flate/testdata/huffman-shifts.wb.expect
+++ b/vendor/github.com/klauspost/compress/flate/testdata/huffman-shifts.wb.expect
--- a/vendor/github.com/klauspost/compress/flate/testdata/huffman-shifts.wb.expect-noinput
+++ b/vendor/github.com/klauspost/compress/flate/testdata/huffman-shifts.wb.expect-noinput
--- a/vendor/github.com/klauspost/compress/flate/testdata/huffman-text-shift.dyn.expect
+++ b/vendor/github.com/klauspost/compress/flate/testdata/huffman-text-shift.dyn.expect
--- a/vendor/github.com/klauspost/compress/flate/testdata/huffman-text-shift.dyn.expect-noinput
+++ b/vendor/github.com/klauspost/compress/flate/testdata/huffman-text-shift.dyn.expect-noinput
--- a/vendor/github.com/klauspost/compress/flate/testdata/huffman-text-shift.golden
+++ b/vendor/github.com/klauspost/compress/flate/testdata/huffman-text-shift.golden
--- a/vendor/github.com/klauspost/compress/flate/testdata/huffman-text-shift.in
+++ b/vendor/github.com/klauspost/compress/flate/testdata/huffman-text-shift.in
@@ -0,0 +1,14 @@
+//Copyright2009ThGoAuthor.Allrightrrvd.
+//UofthiourccodigovrndbyBSD-tyl
+//licnthtcnbfoundinthLICENSEfil.
+
+pckgmin
+
+import"o"
+
+funcmin(){
+	vrb=mk([]byt,65535)
+	f,_:=o.Crt("huffmn-null-mx.in")
+	f.Writ(b)
+}
+ABCDEFGHIJKLMNOPQRSTUVXxyz!"#¤%&/?"
--- a/vendor/github.com/klauspost/compress/flate/testdata/huffman-text-shift.wb.expect
+++ b/vendor/github.com/klauspost/compress/flate/testdata/huffman-text-shift.wb.expect
--- a/vendor/github.com/klauspost/compress/flate/testdata/huffman-text-shift.wb.expect-noinput
+++ b/vendor/github.com/klauspost/compress/flate/testdata/huffman-text-shift.wb.expect-noinput
--- a/vendor/github.com/klauspost/compress/flate/testdata/huffman-text.dyn.expect
+++ b/vendor/github.com/klauspost/compress/flate/testdata/huffman-text.dyn.expect
@@ -0,0 +1 @@
+<1C>_K<5F>0<1C><><EFBFBD><05><0E><>`K<><4B>0Aasě)^<5E>H<EFBFBD><48><EFBFBD><EFBFBD><EFBFBD>Iɟb߻<><DFBB>_><3E>4
--- a/vendor/github.com/klauspost/compress/flate/testdata/huffman-text.dyn.expect-noinput
+++ b/vendor/github.com/klauspost/compress/flate/testdata/huffman-text.dyn.expect-noinput
@@ -0,0 +1 @@
+<1C>_K<5F>0<1C><><EFBFBD><05><0E><>`K<><4B>0Aasě)^<5E>H<EFBFBD><48><EFBFBD><EFBFBD><EFBFBD>Iɟb߻<><DFBB>_><3E>4
--- a/vendor/github.com/klauspost/compress/flate/testdata/huffman-text.golden
+++ b/vendor/github.com/klauspost/compress/flate/testdata/huffman-text.golden
@@ -0,0 +1,3 @@
+<04>AK<41>0<07><>x<>ß<EFBFBD>Z<EFBFBD><5A><EFBFBD><EFBFBD>LP<4C>a<EFBFBD>!<21>x<EFBFBD><78>AD<41><44>I<13>&#I<>E<EFBFBD><45><EFBFBD><EFBFBD><06>p]<5D>Lƿ<4C><C6BF><16>F<EFBFBD>p<><70>	1<>88<38>h<07><13>$<24><><EFBFBD>5S<35><53>-	<09>F66!<21>)v<>.<2E><02>0<EFBFBD>Y<EFBFBD><59><1E><02><><EFBFBD><EFBFBD>&<26><>	S<><53><EFBFBD>N|d<>2:<3A><>
+t<EFBFBD>|둍<><EB918D><EFBFBD>xz9<7A><39><EFBFBD><EFBFBD><EFBFBD><EFBFBD>骺<EFBFBD><04><><EFBFBD><EFBFBD>Ɏ<EFBFBD>3<><33>
+&&=<3D><0E><><EFBFBD><EFBFBD><EFBFBD>ô<EFBFBD>UD<55>=Fu<46><75><EFBFBD><EFBFBD>]<5D><>q<EFBFBD><71><EFBFBD><EFBFBD>UL+<2B><><17><><08>>FQY<51><59>LZ<4C><5A>o<EFBFBD><6F><EFBFBD>fTߵ<54>EŴ<45><C5B4>{<7B>Yʶb<CAB6>e<EFBFBD>
--- a/vendor/github.com/klauspost/compress/flate/testdata/huffman-text.in
+++ b/vendor/github.com/klauspost/compress/flate/testdata/huffman-text.in
@@ -0,0 +1,13 @@
+// Copyright 2009 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+package main
+
+import "os"
+
+func main() {
+	var b = make([]byte, 65535)
+	f, _ := os.Create("huffman-null-max.in")
+	f.Write(b)
+}
--- a/vendor/github.com/klauspost/compress/flate/testdata/huffman-text.wb.expect
+++ b/vendor/github.com/klauspost/compress/flate/testdata/huffman-text.wb.expect
@@ -0,0 +1 @@
+<1C>_K<5F>0<1C><><EFBFBD><05><0E><>`K<><4B>0Aasě)^<5E>H<EFBFBD><48><EFBFBD><EFBFBD><EFBFBD>Iɟb߻<><DFBB>_><3E>4
--- a/vendor/github.com/klauspost/compress/flate/testdata/huffman-text.wb.expect-noinput
+++ b/vendor/github.com/klauspost/compress/flate/testdata/huffman-text.wb.expect-noinput
@@ -0,0 +1 @@
+<1C>_K<5F>0<1C><><EFBFBD><05><0E><>`K<><4B>0Aasě)^<5E>H<EFBFBD><48><EFBFBD><EFBFBD><EFBFBD>Iɟb߻<><DFBB>_><3E>4
--- a/vendor/github.com/klauspost/compress/flate/testdata/huffman-zero.dyn.expect
+++ b/vendor/github.com/klauspost/compress/flate/testdata/huffman-zero.dyn.expect
--- a/vendor/github.com/klauspost/compress/flate/testdata/huffman-zero.dyn.expect-noinput
+++ b/vendor/github.com/klauspost/compress/flate/testdata/huffman-zero.dyn.expect-noinput
--- a/vendor/github.com/klauspost/compress/flate/testdata/huffman-zero.golden
+++ b/vendor/github.com/klauspost/compress/flate/testdata/huffman-zero.golden
--- a/vendor/github.com/klauspost/compress/flate/testdata/huffman-zero.in
+++ b/vendor/github.com/klauspost/compress/flate/testdata/huffman-zero.in
@@ -0,0 +1 @@
+00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
--- a/vendor/github.com/klauspost/compress/flate/testdata/huffman-zero.wb.expect
+++ b/vendor/github.com/klauspost/compress/flate/testdata/huffman-zero.wb.expect
--- a/vendor/github.com/klauspost/compress/flate/testdata/huffman-zero.wb.expect-noinput
+++ b/vendor/github.com/klauspost/compress/flate/testdata/huffman-zero.wb.expect-noinput
--- a/vendor/github.com/klauspost/compress/flate/testdata/null-long-match.dyn.expect-noinput
+++ b/vendor/github.com/klauspost/compress/flate/testdata/null-long-match.dyn.expect-noinput
--- a/vendor/github.com/klauspost/compress/flate/testdata/null-long-match.wb.expect-noinput
+++ b/vendor/github.com/klauspost/compress/flate/testdata/null-long-match.wb.expect-noinput
--- a/vendor/github.com/klauspost/compress/flate/token.go
+++ b/vendor/github.com/klauspost/compress/flate/token.go
@@ -0,0 +1,115 @@
+// Copyright 2009 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+package flate
+
+import "fmt"
+
+const (
+	// 2 bits:   type   0 = literal  1=EOF  2=Match   3=Unused
+	// 8 bits:   xlength = length - MIN_MATCH_LENGTH
+	// 22 bits   xoffset = offset - MIN_OFFSET_SIZE, or literal
+	lengthShift = 22
+	offsetMask  = 1<<lengthShift - 1
+	typeMask    = 3 << 30
+	literalType = 0 << 30
+	matchType   = 1 << 30
+)
+
+// The length code for length X (MIN_MATCH_LENGTH <= X <= MAX_MATCH_LENGTH)
+// is lengthCodes[length - MIN_MATCH_LENGTH]
+var lengthCodes = [...]uint32{
+	0, 1, 2, 3, 4, 5, 6, 7, 8, 8,
+	9, 9, 10, 10, 11, 11, 12, 12, 12, 12,
+	13, 13, 13, 13, 14, 14, 14, 14, 15, 15,
+	15, 15, 16, 16, 16, 16, 16, 16, 16, 16,
+	17, 17, 17, 17, 17, 17, 17, 17, 18, 18,
+	18, 18, 18, 18, 18, 18, 19, 19, 19, 19,
+	19, 19, 19, 19, 20, 20, 20, 20, 20, 20,
+	20, 20, 20, 20, 20, 20, 20, 20, 20, 20,
+	21, 21, 21, 21, 21, 21, 21, 21, 21, 21,
+	21, 21, 21, 21, 21, 21, 22, 22, 22, 22,
+	22, 22, 22, 22, 22, 22, 22, 22, 22, 22,
+	22, 22, 23, 23, 23, 23, 23, 23, 23, 23,
+	23, 23, 23, 23, 23, 23, 23, 23, 24, 24,
+	24, 24, 24, 24, 24, 24, 24, 24, 24, 24,
+	24, 24, 24, 24, 24, 24, 24, 24, 24, 24,
+	24, 24, 24, 24, 24, 24, 24, 24, 24, 24,
+	25, 25, 25, 25, 25, 25, 25, 25, 25, 25,
+	25, 25, 25, 25, 25, 25, 25, 25, 25, 25,
+	25, 25, 25, 25, 25, 25, 25, 25, 25, 25,
+	25, 25, 26, 26, 26, 26, 26, 26, 26, 26,
+	26, 26, 26, 26, 26, 26, 26, 26, 26, 26,
+	26, 26, 26, 26, 26, 26, 26, 26, 26, 26,
+	26, 26, 26, 26, 27, 27, 27, 27, 27, 27,
+	27, 27, 27, 27, 27, 27, 27, 27, 27, 27,
+	27, 27, 27, 27, 27, 27, 27, 27, 27, 27,
+	27, 27, 27, 27, 27, 28,
+}
+
+var offsetCodes = [...]uint32{
+	0, 1, 2, 3, 4, 4, 5, 5, 6, 6, 6, 6, 7, 7, 7, 7,
+	8, 8, 8, 8, 8, 8, 8, 8, 9, 9, 9, 9, 9, 9, 9, 9,
+	10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10,
+	11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11,
+	12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12,
+	12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12,
+	13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13,
+	13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13,
+	14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14,
+	14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14,
+	14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14,
+	14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14,
+	15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15,
+	15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15,
+	15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15,
+	15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15,
+}
+
+type token uint32
+
+type tokens struct {
+	tokens [maxStoreBlockSize + 1]token
+	n      uint16 // Must be able to contain maxStoreBlockSize
+}
+
+// Convert a literal into a literal token.
+func literalToken(literal uint32) token { return token(literalType + literal) }
+
+// Convert a < xlength, xoffset > pair into a match token.
+func matchToken(xlength uint32, xoffset uint32) token {
+	return token(matchType + xlength<<lengthShift + xoffset)
+}
+
+func matchTokend(xlength uint32, xoffset uint32) token {
+	if xlength > maxMatchLength || xoffset > maxMatchOffset {
+		panic(fmt.Sprintf("Invalid match: len: %d, offset: %d\n", xlength, xoffset))
+		return token(matchType)
+	}
+	return token(matchType + xlength<<lengthShift + xoffset)
+}
+
+// Returns the type of a token
+func (t token) typ() uint32 { return uint32(t) & typeMask }
+
+// Returns the literal of a literal token
+func (t token) literal() uint32 { return uint32(t - literalType) }
+
+// Returns the extra offset of a match token
+func (t token) offset() uint32 { return uint32(t) & offsetMask }
+
+func (t token) length() uint32 { return uint32((t - matchType) >> lengthShift) }
+
+func lengthCode(len uint32) uint32 { return lengthCodes[len] }
+
+// Returns the offset code corresponding to a specific offset
+func offsetCode(off uint32) uint32 {
+	if off < uint32(len(offsetCodes)) {
+		return offsetCodes[off]
+	} else if off>>7 < uint32(len(offsetCodes)) {
+		return offsetCodes[off>>7] + 14
+	} else {
+		return offsetCodes[off>>14] + 28
+	}
+}
--- a/vendor/github.com/klauspost/compress/flate/writer_test.go
+++ b/vendor/github.com/klauspost/compress/flate/writer_test.go
@@ -0,0 +1,258 @@
+// Copyright 2012 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+package flate
+
+import (
+	"bytes"
+	"fmt"
+	"io"
+	"io/ioutil"
+	"math/rand"
+	"runtime"
+	"testing"
+)
+
+func benchmarkEncoder(b *testing.B, testfile, level, n int) {
+	b.StopTimer()
+	b.SetBytes(int64(n))
+	buf0, err := ioutil.ReadFile(testfiles[testfile])
+	if err != nil {
+		b.Fatal(err)
+	}
+	if len(buf0) == 0 {
+		b.Fatalf("test file %q has no data", testfiles[testfile])
+	}
+	buf1 := make([]byte, n)
+	for i := 0; i < n; i += len(buf0) {
+		if len(buf0) > n-i {
+			buf0 = buf0[:n-i]
+		}
+		copy(buf1[i:], buf0)
+	}
+	buf0 = nil
+	runtime.GC()
+	w, err := NewWriter(ioutil.Discard, level)
+	b.StartTimer()
+	for i := 0; i < b.N; i++ {
+		w.Reset(ioutil.Discard)
+		_, err = w.Write(buf1)
+		if err != nil {
+			b.Fatal(err)
+		}
+		err = w.Close()
+		if err != nil {
+			b.Fatal(err)
+		}
+	}
+}
+
+func BenchmarkEncodeDigitsConstant1e4(b *testing.B) { benchmarkEncoder(b, digits, constant, 1e4) }
+func BenchmarkEncodeDigitsConstant1e5(b *testing.B) { benchmarkEncoder(b, digits, constant, 1e5) }
+func BenchmarkEncodeDigitsConstant1e6(b *testing.B) { benchmarkEncoder(b, digits, constant, 1e6) }
+func BenchmarkEncodeDigitsSpeed1e4(b *testing.B)    { benchmarkEncoder(b, digits, speed, 1e4) }
+func BenchmarkEncodeDigitsSpeed1e5(b *testing.B)    { benchmarkEncoder(b, digits, speed, 1e5) }
+func BenchmarkEncodeDigitsSpeed1e6(b *testing.B)    { benchmarkEncoder(b, digits, speed, 1e6) }
+func BenchmarkEncodeDigitsDefault1e4(b *testing.B)  { benchmarkEncoder(b, digits, default_, 1e4) }
+func BenchmarkEncodeDigitsDefault1e5(b *testing.B)  { benchmarkEncoder(b, digits, default_, 1e5) }
+func BenchmarkEncodeDigitsDefault1e6(b *testing.B)  { benchmarkEncoder(b, digits, default_, 1e6) }
+func BenchmarkEncodeDigitsCompress1e4(b *testing.B) { benchmarkEncoder(b, digits, compress, 1e4) }
+func BenchmarkEncodeDigitsCompress1e5(b *testing.B) { benchmarkEncoder(b, digits, compress, 1e5) }
+func BenchmarkEncodeDigitsCompress1e6(b *testing.B) { benchmarkEncoder(b, digits, compress, 1e6) }
+func BenchmarkEncodeTwainConstant1e4(b *testing.B)  { benchmarkEncoder(b, twain, constant, 1e4) }
+func BenchmarkEncodeTwainConstant1e5(b *testing.B)  { benchmarkEncoder(b, twain, constant, 1e5) }
+func BenchmarkEncodeTwainConstant1e6(b *testing.B)  { benchmarkEncoder(b, twain, constant, 1e6) }
+func BenchmarkEncodeTwainSpeed1e4(b *testing.B)     { benchmarkEncoder(b, twain, speed, 1e4) }
+func BenchmarkEncodeTwainSpeed1e5(b *testing.B)     { benchmarkEncoder(b, twain, speed, 1e5) }
+func BenchmarkEncodeTwainSpeed1e6(b *testing.B)     { benchmarkEncoder(b, twain, speed, 1e6) }
+func BenchmarkEncodeTwainDefault1e4(b *testing.B)   { benchmarkEncoder(b, twain, default_, 1e4) }
+func BenchmarkEncodeTwainDefault1e5(b *testing.B)   { benchmarkEncoder(b, twain, default_, 1e5) }
+func BenchmarkEncodeTwainDefault1e6(b *testing.B)   { benchmarkEncoder(b, twain, default_, 1e6) }
+func BenchmarkEncodeTwainCompress1e4(b *testing.B)  { benchmarkEncoder(b, twain, compress, 1e4) }
+func BenchmarkEncodeTwainCompress1e5(b *testing.B)  { benchmarkEncoder(b, twain, compress, 1e5) }
+func BenchmarkEncodeTwainCompress1e6(b *testing.B)  { benchmarkEncoder(b, twain, compress, 1e6) }
+
+// A writer that fails after N writes.
+type errorWriter struct {
+	N int
+}
+
+func (e *errorWriter) Write(b []byte) (int, error) {
+	if e.N <= 0 {
+		return 0, io.ErrClosedPipe
+	}
+	e.N--
+	return len(b), nil
+}
+
+// Test if errors from the underlying writer is passed upwards.
+func TestWriteError(t *testing.T) {
+	buf := new(bytes.Buffer)
+	n := 65536
+	if !testing.Short() {
+		n *= 4
+	}
+	for i := 0; i < n; i++ {
+		fmt.Fprintf(buf, "asdasfasf%d%dfghfgujyut%dyutyu\n", i, i, i)
+	}
+	in := buf.Bytes()
+	// We create our own buffer to control number of writes.
+	copyBuf := make([]byte, 128)
+	for l := 0; l < 10; l++ {
+		for fail := 1; fail <= 256; fail *= 2 {
+			// Fail after 'fail' writes
+			ew := &errorWriter{N: fail}
+			w, err := NewWriter(ew, l)
+			if err != nil {
+				t.Fatalf("NewWriter: level %d: %v", l, err)
+			}
+			n, err := copyBuffer(w, bytes.NewBuffer(in), copyBuf)
+			if err == nil {
+				t.Fatalf("Level %d: Expected an error, writer was %#v", l, ew)
+			}
+			n2, err := w.Write([]byte{1, 2, 2, 3, 4, 5})
+			if n2 != 0 {
+				t.Fatal("Level", l, "Expected 0 length write, got", n)
+			}
+			if err == nil {
+				t.Fatal("Level", l, "Expected an error")
+			}
+			err = w.Flush()
+			if err == nil {
+				t.Fatal("Level", l, "Expected an error on flush")
+			}
+			err = w.Close()
+			if err == nil {
+				t.Fatal("Level", l, "Expected an error on close")
+			}
+
+			w.Reset(ioutil.Discard)
+			n2, err = w.Write([]byte{1, 2, 3, 4, 5, 6})
+			if err != nil {
+				t.Fatal("Level", l, "Got unexpected error after reset:", err)
+			}
+			if n2 == 0 {
+				t.Fatal("Level", l, "Got 0 length write, expected > 0")
+			}
+			if testing.Short() {
+				return
+			}
+		}
+	}
+}
+
+func TestDeterministicL1(t *testing.T)  { testDeterministic(1, t) }
+func TestDeterministicL2(t *testing.T)  { testDeterministic(2, t) }
+func TestDeterministicL3(t *testing.T)  { testDeterministic(3, t) }
+func TestDeterministicL4(t *testing.T)  { testDeterministic(4, t) }
+func TestDeterministicL5(t *testing.T)  { testDeterministic(5, t) }
+func TestDeterministicL6(t *testing.T)  { testDeterministic(6, t) }
+func TestDeterministicL7(t *testing.T)  { testDeterministic(7, t) }
+func TestDeterministicL8(t *testing.T)  { testDeterministic(8, t) }
+func TestDeterministicL9(t *testing.T)  { testDeterministic(9, t) }
+func TestDeterministicL0(t *testing.T)  { testDeterministic(0, t) }
+func TestDeterministicLM2(t *testing.T) { testDeterministic(-2, t) }
+
+func testDeterministic(i int, t *testing.T) {
+	// Test so much we cross a good number of block boundaries.
+	var length = maxStoreBlockSize*30 + 500
+	if testing.Short() {
+		length /= 10
+	}
+
+	// Create a random, but compressible stream.
+	rng := rand.New(rand.NewSource(1))
+	t1 := make([]byte, length)
+	for i := range t1 {
+		t1[i] = byte(rng.Int63() & 7)
+	}
+
+	// Do our first encode.
+	var b1 bytes.Buffer
+	br := bytes.NewBuffer(t1)
+	w, err := NewWriter(&b1, i)
+	if err != nil {
+		t.Fatal(err)
+	}
+	// Use a very small prime sized buffer.
+	cbuf := make([]byte, 787)
+	_, err = copyBuffer(w, br, cbuf)
+	if err != nil {
+		t.Fatal(err)
+	}
+	w.Close()
+
+	// We choose a different buffer size,
+	// bigger than a maximum block, and also a prime.
+	var b2 bytes.Buffer
+	cbuf = make([]byte, 81761)
+	br2 := bytes.NewBuffer(t1)
+	w2, err := NewWriter(&b2, i)
+	if err != nil {
+		t.Fatal(err)
+	}
+	_, err = copyBuffer(w2, br2, cbuf)
+	if err != nil {
+		t.Fatal(err)
+	}
+	w2.Close()
+
+	b1b := b1.Bytes()
+	b2b := b2.Bytes()
+
+	if !bytes.Equal(b1b, b2b) {
+		t.Errorf("level %d did not produce deterministic result, result mismatch, len(a) = %d, len(b) = %d", i, len(b1b), len(b2b))
+	}
+
+	// Test using io.WriterTo interface.
+	var b3 bytes.Buffer
+	br = bytes.NewBuffer(t1)
+	w, err = NewWriter(&b3, i)
+	if err != nil {
+		t.Fatal(err)
+	}
+	_, err = br.WriteTo(w)
+	if err != nil {
+		t.Fatal(err)
+	}
+	w.Close()
+
+	b3b := b3.Bytes()
+	if !bytes.Equal(b1b, b3b) {
+		t.Errorf("level %d (io.WriterTo) did not produce deterministic result, result mismatch, len(a) = %d, len(b) = %d", i, len(b1b), len(b3b))
+	}
+}
+
+// copyBuffer is a copy of io.CopyBuffer, since we want to support older go versions.
+// This is modified to never use io.WriterTo or io.ReaderFrom interfaces.
+func copyBuffer(dst io.Writer, src io.Reader, buf []byte) (written int64, err error) {
+	if buf == nil {
+		buf = make([]byte, 32*1024)
+	}
+	for {
+		nr, er := src.Read(buf)
+		if nr > 0 {
+			nw, ew := dst.Write(buf[0:nr])
+			if nw > 0 {
+				written += int64(nw)
+			}
+			if ew != nil {
+				err = ew
+				break
+			}
+			if nr != nw {
+				err = io.ErrShortWrite
+				break
+			}
+		}
+		if er == io.EOF {
+			break
+		}
+		if er != nil {
+			err = er
+			break
+		}
+	}
+	return written, err
+}
--- a/vendor/github.com/klauspost/compress/gzip/example_test.go
+++ b/vendor/github.com/klauspost/compress/gzip/example_test.go
@@ -0,0 +1,128 @@
+// Copyright 2016 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+package gzip_test
+
+import (
+	"bytes"
+	"compress/gzip"
+	"fmt"
+	"io"
+	"log"
+	"os"
+	"time"
+)
+
+func Example_writerReader() {
+	var buf bytes.Buffer
+	zw := gzip.NewWriter(&buf)
+
+	// Setting the Header fields is optional.
+	zw.Name = "a-new-hope.txt"
+	zw.Comment = "an epic space opera by George Lucas"
+	zw.ModTime = time.Date(1977, time.May, 25, 0, 0, 0, 0, time.UTC)
+
+	_, err := zw.Write([]byte("A long time ago in a galaxy far, far away..."))
+	if err != nil {
+		log.Fatal(err)
+	}
+
+	if err := zw.Close(); err != nil {
+		log.Fatal(err)
+	}
+
+	zr, err := gzip.NewReader(&buf)
+	if err != nil {
+		log.Fatal(err)
+	}
+
+	fmt.Printf("Name: %s\nComment: %s\nModTime: %s\n\n", zr.Name, zr.Comment, zr.ModTime.UTC())
+
+	if _, err := io.Copy(os.Stdout, zr); err != nil {
+		log.Fatal(err)
+	}
+
+	if err := zr.Close(); err != nil {
+		log.Fatal(err)
+	}
+
+	// Output:
+	// Name: a-new-hope.txt
+	// Comment: an epic space opera by George Lucas
+	// ModTime: 1977-05-25 00:00:00 +0000 UTC
+	//
+	// A long time ago in a galaxy far, far away...
+}
+
+func ExampleReader_Multistream() {
+	var buf bytes.Buffer
+	zw := gzip.NewWriter(&buf)
+
+	var files = []struct {
+		name    string
+		comment string
+		modTime time.Time
+		data    string
+	}{
+		{"file-1.txt", "file-header-1", time.Date(2006, time.February, 1, 3, 4, 5, 0, time.UTC), "Hello Gophers - 1"},
+		{"file-2.txt", "file-header-2", time.Date(2007, time.March, 2, 4, 5, 6, 1, time.UTC), "Hello Gophers - 2"},
+	}
+
+	for _, file := range files {
+		zw.Name = file.name
+		zw.Comment = file.comment
+		zw.ModTime = file.modTime
+
+		if _, err := zw.Write([]byte(file.data)); err != nil {
+			log.Fatal(err)
+		}
+
+		if err := zw.Close(); err != nil {
+			log.Fatal(err)
+		}
+
+		zw.Reset(&buf)
+	}
+
+	zr, err := gzip.NewReader(&buf)
+	if err != nil {
+		log.Fatal(err)
+	}
+
+	for {
+		zr.Multistream(false)
+		fmt.Printf("Name: %s\nComment: %s\nModTime: %s\n\n", zr.Name, zr.Comment, zr.ModTime.UTC())
+
+		if _, err := io.Copy(os.Stdout, zr); err != nil {
+			log.Fatal(err)
+		}
+
+		fmt.Println("\n")
+
+		err = zr.Reset(&buf)
+		if err == io.EOF {
+			break
+		}
+		if err != nil {
+			log.Fatal(err)
+		}
+	}
+
+	if err := zr.Close(); err != nil {
+		log.Fatal(err)
+	}
+
+	// Output:
+	// Name: file-1.txt
+	// Comment: file-header-1
+	// ModTime: 2006-02-01 03:04:05 +0000 UTC
+	//
+	// Hello Gophers - 1
+	//
+	// Name: file-2.txt
+	// Comment: file-header-2
+	// ModTime: 2007-03-02 04:05:06 +0000 UTC
+	//
+	// Hello Gophers - 2
+}
--- a/vendor/github.com/klauspost/compress/gzip/gunzip.go
+++ b/vendor/github.com/klauspost/compress/gzip/gunzip.go
@@ -0,0 +1,344 @@
+// Copyright 2009 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+// Package gzip implements reading and writing of gzip format compressed files,
+// as specified in RFC 1952.
+package gzip
+
+import (
+	"bufio"
+	"encoding/binary"
+	"errors"
+	"io"
+	"time"
+
+	"github.com/klauspost/compress/flate"
+	"github.com/klauspost/crc32"
+)
+
+const (
+	gzipID1     = 0x1f
+	gzipID2     = 0x8b
+	gzipDeflate = 8
+	flagText    = 1 << 0
+	flagHdrCrc  = 1 << 1
+	flagExtra   = 1 << 2
+	flagName    = 1 << 3
+	flagComment = 1 << 4
+)
+
+var (
+	// ErrChecksum is returned when reading GZIP data that has an invalid checksum.
+	ErrChecksum = errors.New("gzip: invalid checksum")
+	// ErrHeader is returned when reading GZIP data that has an invalid header.
+	ErrHeader = errors.New("gzip: invalid header")
+)
+
+var le = binary.LittleEndian
+
+// noEOF converts io.EOF to io.ErrUnexpectedEOF.
+func noEOF(err error) error {
+	if err == io.EOF {
+		return io.ErrUnexpectedEOF
+	}
+	return err
+}
+
+// The gzip file stores a header giving metadata about the compressed file.
+// That header is exposed as the fields of the Writer and Reader structs.
+//
+// Strings must be UTF-8 encoded and may only contain Unicode code points
+// U+0001 through U+00FF, due to limitations of the GZIP file format.
+type Header struct {
+	Comment string    // comment
+	Extra   []byte    // "extra data"
+	ModTime time.Time // modification time
+	Name    string    // file name
+	OS      byte      // operating system type
+}
+
+// A Reader is an io.Reader that can be read to retrieve
+// uncompressed data from a gzip-format compressed file.
+//
+// In general, a gzip file can be a concatenation of gzip files,
+// each with its own header. Reads from the Reader
+// return the concatenation of the uncompressed data of each.
+// Only the first header is recorded in the Reader fields.
+//
+// Gzip files store a length and checksum of the uncompressed data.
+// The Reader will return a ErrChecksum when Read
+// reaches the end of the uncompressed data if it does not
+// have the expected length or checksum. Clients should treat data
+// returned by Read as tentative until they receive the io.EOF
+// marking the end of the data.
+type Reader struct {
+	Header       // valid after NewReader or Reader.Reset
+	r            flate.Reader
+	decompressor io.ReadCloser
+	digest       uint32 // CRC-32, IEEE polynomial (section 8)
+	size         uint32 // Uncompressed size (section 2.3.1)
+	buf          [512]byte
+	err          error
+	multistream  bool
+}
+
+// NewReader creates a new Reader reading the given reader.
+// If r does not also implement io.ByteReader,
+// the decompressor may read more data than necessary from r.
+//
+// It is the caller's responsibility to call Close on the Reader when done.
+//
+// The Reader.Header fields will be valid in the Reader returned.
+func NewReader(r io.Reader) (*Reader, error) {
+	z := new(Reader)
+	if err := z.Reset(r); err != nil {
+		return nil, err
+	}
+	return z, nil
+}
+
+// Reset discards the Reader z's state and makes it equivalent to the
+// result of its original state from NewReader, but reading from r instead.
+// This permits reusing a Reader rather than allocating a new one.
+func (z *Reader) Reset(r io.Reader) error {
+	*z = Reader{
+		decompressor: z.decompressor,
+		multistream:  true,
+	}
+	if rr, ok := r.(flate.Reader); ok {
+		z.r = rr
+	} else {
+		z.r = bufio.NewReader(r)
+	}
+	z.Header, z.err = z.readHeader()
+	return z.err
+}
+
+// Multistream controls whether the reader supports multistream files.
+//
+// If enabled (the default), the Reader expects the input to be a sequence
+// of individually gzipped data streams, each with its own header and
+// trailer, ending at EOF. The effect is that the concatenation of a sequence
+// of gzipped files is treated as equivalent to the gzip of the concatenation
+// of the sequence. This is standard behavior for gzip readers.
+//
+// Calling Multistream(false) disables this behavior; disabling the behavior
+// can be useful when reading file formats that distinguish individual gzip
+// data streams or mix gzip data streams with other data streams.
+// In this mode, when the Reader reaches the end of the data stream,
+// Read returns io.EOF. If the underlying reader implements io.ByteReader,
+// it will be left positioned just after the gzip stream.
+// To start the next stream, call z.Reset(r) followed by z.Multistream(false).
+// If there is no next stream, z.Reset(r) will return io.EOF.
+func (z *Reader) Multistream(ok bool) {
+	z.multistream = ok
+}
+
+// readString reads a NUL-terminated string from z.r.
+// It treats the bytes read as being encoded as ISO 8859-1 (Latin-1) and
+// will output a string encoded using UTF-8.
+// This method always updates z.digest with the data read.
+func (z *Reader) readString() (string, error) {
+	var err error
+	needConv := false
+	for i := 0; ; i++ {
+		if i >= len(z.buf) {
+			return "", ErrHeader
+		}
+		z.buf[i], err = z.r.ReadByte()
+		if err != nil {
+			return "", err
+		}
+		if z.buf[i] > 0x7f {
+			needConv = true
+		}
+		if z.buf[i] == 0 {
+			// Digest covers the NUL terminator.
+			z.digest = crc32.Update(z.digest, crc32.IEEETable, z.buf[:i+1])
+
+			// Strings are ISO 8859-1, Latin-1 (RFC 1952, section 2.3.1).
+			if needConv {
+				s := make([]rune, 0, i)
+				for _, v := range z.buf[:i] {
+					s = append(s, rune(v))
+				}
+				return string(s), nil
+			}
+			return string(z.buf[:i]), nil
+		}
+	}
+}
+
+// readHeader reads the GZIP header according to section 2.3.1.
+// This method does not set z.err.
+func (z *Reader) readHeader() (hdr Header, err error) {
+	if _, err = io.ReadFull(z.r, z.buf[:10]); err != nil {
+		// RFC 1952, section 2.2, says the following:
+		//	A gzip file consists of a series of "members" (compressed data sets).
+		//
+		// Other than this, the specification does not clarify whether a
+		// "series" is defined as "one or more" or "zero or more". To err on the
+		// side of caution, Go interprets this to mean "zero or more".
+		// Thus, it is okay to return io.EOF here.
+		return hdr, err
+	}
+	if z.buf[0] != gzipID1 || z.buf[1] != gzipID2 || z.buf[2] != gzipDeflate {
+		return hdr, ErrHeader
+	}
+	flg := z.buf[3]
+	hdr.ModTime = time.Unix(int64(le.Uint32(z.buf[4:8])), 0)
+	// z.buf[8] is XFL and is currently ignored.
+	hdr.OS = z.buf[9]
+	z.digest = crc32.ChecksumIEEE(z.buf[:10])
+
+	if flg&flagExtra != 0 {
+		if _, err = io.ReadFull(z.r, z.buf[:2]); err != nil {
+			return hdr, noEOF(err)
+		}
+		z.digest = crc32.Update(z.digest, crc32.IEEETable, z.buf[:2])
+		data := make([]byte, le.Uint16(z.buf[:2]))
+		if _, err = io.ReadFull(z.r, data); err != nil {
+			return hdr, noEOF(err)
+		}
+		z.digest = crc32.Update(z.digest, crc32.IEEETable, data)
+		hdr.Extra = data
+	}
+
+	var s string
+	if flg&flagName != 0 {
+		if s, err = z.readString(); err != nil {
+			return hdr, err
+		}
+		hdr.Name = s
+	}
+
+	if flg&flagComment != 0 {
+		if s, err = z.readString(); err != nil {
+			return hdr, err
+		}
+		hdr.Comment = s
+	}
+
+	if flg&flagHdrCrc != 0 {
+		if _, err = io.ReadFull(z.r, z.buf[:2]); err != nil {
+			return hdr, noEOF(err)
+		}
+		digest := le.Uint16(z.buf[:2])
+		if digest != uint16(z.digest) {
+			return hdr, ErrHeader
+		}
+	}
+
+	z.digest = 0
+	if z.decompressor == nil {
+		z.decompressor = flate.NewReader(z.r)
+	} else {
+		z.decompressor.(flate.Resetter).Reset(z.r, nil)
+	}
+	return hdr, nil
+}
+
+// Read implements io.Reader, reading uncompressed bytes from its underlying Reader.
+func (z *Reader) Read(p []byte) (n int, err error) {
+	if z.err != nil {
+		return 0, z.err
+	}
+
+	n, z.err = z.decompressor.Read(p)
+	z.digest = crc32.Update(z.digest, crc32.IEEETable, p[:n])
+	z.size += uint32(n)
+	if z.err != io.EOF {
+		// In the normal case we return here.
+		return n, z.err
+	}
+
+	// Finished file; check checksum and size.
+	if _, err := io.ReadFull(z.r, z.buf[:8]); err != nil {
+		z.err = noEOF(err)
+		return n, z.err
+	}
+	digest := le.Uint32(z.buf[:4])
+	size := le.Uint32(z.buf[4:8])
+	if digest != z.digest || size != z.size {
+		z.err = ErrChecksum
+		return n, z.err
+	}
+	z.digest, z.size = 0, 0
+
+	// File is ok; check if there is another.
+	if !z.multistream {
+		return n, io.EOF
+	}
+	z.err = nil // Remove io.EOF
+
+	if _, z.err = z.readHeader(); z.err != nil {
+		return n, z.err
+	}
+
+	// Read from next file, if necessary.
+	if n > 0 {
+		return n, nil
+	}
+	return z.Read(p)
+}
+
+// Support the io.WriteTo interface for io.Copy and friends.
+func (z *Reader) WriteTo(w io.Writer) (int64, error) {
+	total := int64(0)
+	crcWriter := crc32.NewIEEE()
+	for {
+		if z.err != nil {
+			if z.err == io.EOF {
+				return total, nil
+			}
+			return total, z.err
+		}
+
+		// We write both to output and digest.
+		mw := io.MultiWriter(w, crcWriter)
+		n, err := z.decompressor.(io.WriterTo).WriteTo(mw)
+		total += n
+		z.size += uint32(n)
+		if err != nil {
+			z.err = err
+			return total, z.err
+		}
+
+		// Finished file; check checksum + size.
+		if _, err := io.ReadFull(z.r, z.buf[0:8]); err != nil {
+			if err == io.EOF {
+				err = io.ErrUnexpectedEOF
+			}
+			z.err = err
+			return total, err
+		}
+		z.digest = crcWriter.Sum32()
+		digest := le.Uint32(z.buf[:4])
+		size := le.Uint32(z.buf[4:8])
+		if digest != z.digest || size != z.size {
+			z.err = ErrChecksum
+			return total, z.err
+		}
+		z.digest, z.size = 0, 0
+
+		// File is ok; check if there is another.
+		if !z.multistream {
+			return total, nil
+		}
+		crcWriter.Reset()
+		z.err = nil // Remove io.EOF
+
+		if _, z.err = z.readHeader(); z.err != nil {
+			if z.err == io.EOF {
+				return total, nil
+			}
+			return total, z.err
+		}
+	}
+}
+
+// Close closes the Reader. It does not close the underlying io.Reader.
+// In order for the GZIP checksum to be verified, the reader must be
+// fully consumed until the io.EOF.
+func (z *Reader) Close() error { return z.decompressor.Close() }
--- a/vendor/github.com/klauspost/compress/gzip/gunzip_test.go
+++ b/vendor/github.com/klauspost/compress/gzip/gunzip_test.go
@@ -0,0 +1,682 @@
+// Copyright 2009 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+package gzip
+
+import (
+	"bytes"
+	oldgz "compress/gzip"
+	"crypto/rand"
+	"io"
+	"io/ioutil"
+	"os"
+	"strings"
+	"testing"
+	"time"
+
+	"github.com/klauspost/compress/flate"
+)
+
+type gunzipTest struct {
+	name string
+	desc string
+	raw  string
+	gzip []byte
+	err  error
+}
+
+var gunzipTests = []gunzipTest{
+	{ // has 1 empty fixed-huffman block
+		"empty.txt",
+		"empty.txt",
+		"",
+		[]byte{
+			0x1f, 0x8b, 0x08, 0x08, 0xf7, 0x5e, 0x14, 0x4a,
+			0x00, 0x03, 0x65, 0x6d, 0x70, 0x74, 0x79, 0x2e,
+			0x74, 0x78, 0x74, 0x00, 0x03, 0x00, 0x00, 0x00,
+			0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+		},
+		nil,
+	},
+	{
+		"",
+		"empty - with no file name",
+		"",
+		[]byte{
+			0x1f, 0x8b, 0x08, 0x00, 0x00, 0x09, 0x6e, 0x88,
+			0x00, 0xff, 0x01, 0x00, 0x00, 0xff, 0xff, 0x00,
+			0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+		},
+		nil,
+	},
+	{ // has 1 non-empty fixed huffman block
+		"hello.txt",
+		"hello.txt",
+		"hello world\n",
+		[]byte{
+			0x1f, 0x8b, 0x08, 0x08, 0xc8, 0x58, 0x13, 0x4a,
+			0x00, 0x03, 0x68, 0x65, 0x6c, 0x6c, 0x6f, 0x2e,
+			0x74, 0x78, 0x74, 0x00, 0xcb, 0x48, 0xcd, 0xc9,
+			0xc9, 0x57, 0x28, 0xcf, 0x2f, 0xca, 0x49, 0xe1,
+			0x02, 0x00, 0x2d, 0x3b, 0x08, 0xaf, 0x0c, 0x00,
+			0x00, 0x00,
+		},
+		nil,
+	},
+	{ // concatenation
+		"hello.txt",
+		"hello.txt x2",
+		"hello world\n" +
+			"hello world\n",
+		[]byte{
+			0x1f, 0x8b, 0x08, 0x08, 0xc8, 0x58, 0x13, 0x4a,
+			0x00, 0x03, 0x68, 0x65, 0x6c, 0x6c, 0x6f, 0x2e,
+			0x74, 0x78, 0x74, 0x00, 0xcb, 0x48, 0xcd, 0xc9,
+			0xc9, 0x57, 0x28, 0xcf, 0x2f, 0xca, 0x49, 0xe1,
+			0x02, 0x00, 0x2d, 0x3b, 0x08, 0xaf, 0x0c, 0x00,
+			0x00, 0x00,
+			0x1f, 0x8b, 0x08, 0x08, 0xc8, 0x58, 0x13, 0x4a,
+			0x00, 0x03, 0x68, 0x65, 0x6c, 0x6c, 0x6f, 0x2e,
+			0x74, 0x78, 0x74, 0x00, 0xcb, 0x48, 0xcd, 0xc9,
+			0xc9, 0x57, 0x28, 0xcf, 0x2f, 0xca, 0x49, 0xe1,
+			0x02, 0x00, 0x2d, 0x3b, 0x08, 0xaf, 0x0c, 0x00,
+			0x00, 0x00,
+		},
+		nil,
+	},
+	{ // has a fixed huffman block with some length-distance pairs
+		"shesells.txt",
+		"shesells.txt",
+		"she sells seashells by the seashore\n",
+		[]byte{
+			0x1f, 0x8b, 0x08, 0x08, 0x72, 0x66, 0x8b, 0x4a,
+			0x00, 0x03, 0x73, 0x68, 0x65, 0x73, 0x65, 0x6c,
+			0x6c, 0x73, 0x2e, 0x74, 0x78, 0x74, 0x00, 0x2b,
+			0xce, 0x48, 0x55, 0x28, 0x4e, 0xcd, 0xc9, 0x29,
+			0x06, 0x92, 0x89, 0xc5, 0x19, 0x60, 0x56, 0x52,
+			0xa5, 0x42, 0x09, 0x58, 0x18, 0x28, 0x90, 0x5f,
+			0x94, 0xca, 0x05, 0x00, 0x76, 0xb0, 0x3b, 0xeb,
+			0x24, 0x00, 0x00, 0x00,
+		},
+		nil,
+	},
+	{ // has dynamic huffman blocks
+		"gettysburg",
+		"gettysburg",
+		"  Four score and seven years ago our fathers brought forth on\n" +
+			"this continent, a new nation, conceived in Liberty, and dedicated\n" +
+			"to the proposition that all men are created equal.\n" +
+			"  Now we are engaged in a great Civil War, testing whether that\n" +
+			"nation, or any nation so conceived and so dedicated, can long\n" +
+			"endure.\n" +
+			"  We are met on a great battle-field of that war.\n" +
+			"  We have come to dedicate a portion of that field, as a final\n" +
+			"resting place for those who here gave their lives that that\n" +
+			"nation might live.  It is altogether fitting and proper that\n" +
+			"we should do this.\n" +
+			"  But, in a larger sense, we can not dedicate — we can not\n" +
+			"consecrate — we can not hallow — this ground.\n" +
+			"  The brave men, living and dead, who struggled here, have\n" +
+			"consecrated it, far above our poor power to add or detract.\n" +
+			"The world will little note, nor long remember what we say here,\n" +
+			"but it can never forget what they did here.\n" +
+			"  It is for us the living, rather, to be dedicated here to the\n" +
+			"unfinished work which they who fought here have thus far so\n" +
+			"nobly advanced.  It is rather for us to be here dedicated to\n" +
+			"the great task remaining before us — that from these honored\n" +
+			"dead we take increased devotion to that cause for which they\n" +
+			"gave the last full measure of devotion —\n" +
+			"  that we here highly resolve that these dead shall not have\n" +
+			"died in vain — that this nation, under God, shall have a new\n" +
+			"birth of freedom — and that government of the people, by the\n" +
+			"people, for the people, shall not perish from this earth.\n" +
+			"\n" +
+			"Abraham Lincoln, November 19, 1863, Gettysburg, Pennsylvania\n",
+		[]byte{
+			0x1f, 0x8b, 0x08, 0x08, 0xd1, 0x12, 0x2b, 0x4a,
+			0x00, 0x03, 0x67, 0x65, 0x74, 0x74, 0x79, 0x73,
+			0x62, 0x75, 0x72, 0x67, 0x00, 0x65, 0x54, 0xcd,
+			0x6e, 0xd4, 0x30, 0x10, 0xbe, 0xfb, 0x29, 0xe6,
+			0x01, 0x42, 0xa5, 0x0a, 0x09, 0xc1, 0x11, 0x90,
+			0x40, 0x48, 0xa8, 0xe2, 0x80, 0xd4, 0xf3, 0x24,
+			0x9e, 0x24, 0x56, 0xbd, 0x9e, 0xc5, 0x76, 0x76,
+			0x95, 0x1b, 0x0f, 0xc1, 0x13, 0xf2, 0x24, 0x7c,
+			0x63, 0x77, 0x9b, 0x4a, 0x5c, 0xaa, 0x6e, 0x6c,
+			0xcf, 0x7c, 0x7f, 0x33, 0x44, 0x5f, 0x74, 0xcb,
+			0x54, 0x26, 0xcd, 0x42, 0x9c, 0x3c, 0x15, 0xb9,
+			0x48, 0xa2, 0x5d, 0x38, 0x17, 0xe2, 0x45, 0xc9,
+			0x4e, 0x67, 0xae, 0xab, 0xe0, 0xf7, 0x98, 0x75,
+			0x5b, 0xd6, 0x4a, 0xb3, 0xe6, 0xba, 0x92, 0x26,
+			0x57, 0xd7, 0x50, 0x68, 0xd2, 0x54, 0x43, 0x92,
+			0x54, 0x07, 0x62, 0x4a, 0x72, 0xa5, 0xc4, 0x35,
+			0x68, 0x1a, 0xec, 0x60, 0x92, 0x70, 0x11, 0x4f,
+			0x21, 0xd1, 0xf7, 0x30, 0x4a, 0xae, 0xfb, 0xd0,
+			0x9a, 0x78, 0xf1, 0x61, 0xe2, 0x2a, 0xde, 0x55,
+			0x25, 0xd4, 0xa6, 0x73, 0xd6, 0xb3, 0x96, 0x60,
+			0xef, 0xf0, 0x9b, 0x2b, 0x71, 0x8c, 0x74, 0x02,
+			0x10, 0x06, 0xac, 0x29, 0x8b, 0xdd, 0x25, 0xf9,
+			0xb5, 0x71, 0xbc, 0x73, 0x44, 0x0f, 0x7a, 0xa5,
+			0xab, 0xb4, 0x33, 0x49, 0x0b, 0x2f, 0xbd, 0x03,
+			0xd3, 0x62, 0x17, 0xe9, 0x73, 0xb8, 0x84, 0x48,
+			0x8f, 0x9c, 0x07, 0xaa, 0x52, 0x00, 0x6d, 0xa1,
+			0xeb, 0x2a, 0xc6, 0xa0, 0x95, 0x76, 0x37, 0x78,
+			0x9a, 0x81, 0x65, 0x7f, 0x46, 0x4b, 0x45, 0x5f,
+			0xe1, 0x6d, 0x42, 0xe8, 0x01, 0x13, 0x5c, 0x38,
+			0x51, 0xd4, 0xb4, 0x38, 0x49, 0x7e, 0xcb, 0x62,
+			0x28, 0x1e, 0x3b, 0x82, 0x93, 0x54, 0x48, 0xf1,
+			0xd2, 0x7d, 0xe4, 0x5a, 0xa3, 0xbc, 0x99, 0x83,
+			0x44, 0x4f, 0x3a, 0x77, 0x36, 0x57, 0xce, 0xcf,
+			0x2f, 0x56, 0xbe, 0x80, 0x90, 0x9e, 0x84, 0xea,
+			0x51, 0x1f, 0x8f, 0xcf, 0x90, 0xd4, 0x60, 0xdc,
+			0x5e, 0xb4, 0xf7, 0x10, 0x0b, 0x26, 0xe0, 0xff,
+			0xc4, 0xd1, 0xe5, 0x67, 0x2e, 0xe7, 0xc8, 0x93,
+			0x98, 0x05, 0xb8, 0xa8, 0x45, 0xc0, 0x4d, 0x09,
+			0xdc, 0x84, 0x16, 0x2b, 0x0d, 0x9a, 0x21, 0x53,
+			0x04, 0x8b, 0xd2, 0x0b, 0xbd, 0xa2, 0x4c, 0xa7,
+			0x60, 0xee, 0xd9, 0xe1, 0x1d, 0xd1, 0xb7, 0x4a,
+			0x30, 0x8f, 0x63, 0xd5, 0xa5, 0x8b, 0x33, 0x87,
+			0xda, 0x1a, 0x18, 0x79, 0xf3, 0xe3, 0xa6, 0x17,
+			0x94, 0x2e, 0xab, 0x6e, 0xa0, 0xe3, 0xcd, 0xac,
+			0x50, 0x8c, 0xca, 0xa7, 0x0d, 0x76, 0x37, 0xd1,
+			0x23, 0xe7, 0x05, 0x57, 0x8b, 0xa4, 0x22, 0x83,
+			0xd9, 0x62, 0x52, 0x25, 0xad, 0x07, 0xbb, 0xbf,
+			0xbf, 0xff, 0xbc, 0xfa, 0xee, 0x20, 0x73, 0x91,
+			0x29, 0xff, 0x7f, 0x02, 0x71, 0x62, 0x84, 0xb5,
+			0xf6, 0xb5, 0x25, 0x6b, 0x41, 0xde, 0x92, 0xb7,
+			0x76, 0x3f, 0x91, 0x91, 0x31, 0x1b, 0x41, 0x84,
+			0x62, 0x30, 0x0a, 0x37, 0xa4, 0x5e, 0x18, 0x3a,
+			0x99, 0x08, 0xa5, 0xe6, 0x6d, 0x59, 0x22, 0xec,
+			0x33, 0x39, 0x86, 0x26, 0xf5, 0xab, 0x66, 0xc8,
+			0x08, 0x20, 0xcf, 0x0c, 0xd7, 0x47, 0x45, 0x21,
+			0x0b, 0xf6, 0x59, 0xd5, 0xfe, 0x5c, 0x8d, 0xaa,
+			0x12, 0x7b, 0x6f, 0xa1, 0xf0, 0x52, 0x33, 0x4f,
+			0xf5, 0xce, 0x59, 0xd3, 0xab, 0x66, 0x10, 0xbf,
+			0x06, 0xc4, 0x31, 0x06, 0x73, 0xd6, 0x80, 0xa2,
+			0x78, 0xc2, 0x45, 0xcb, 0x03, 0x65, 0x39, 0xc9,
+			0x09, 0xd1, 0x06, 0x04, 0x33, 0x1a, 0x5a, 0xf1,
+			0xde, 0x01, 0xb8, 0x71, 0x83, 0xc4, 0xb5, 0xb3,
+			0xc3, 0x54, 0x65, 0x33, 0x0d, 0x5a, 0xf7, 0x9b,
+			0x90, 0x7c, 0x27, 0x1f, 0x3a, 0x58, 0xa3, 0xd8,
+			0xfd, 0x30, 0x5f, 0xb7, 0xd2, 0x66, 0xa2, 0x93,
+			0x1c, 0x28, 0xb7, 0xe9, 0x1b, 0x0c, 0xe1, 0x28,
+			0x47, 0x26, 0xbb, 0xe9, 0x7d, 0x7e, 0xdc, 0x96,
+			0x10, 0x92, 0x50, 0x56, 0x7c, 0x06, 0xe2, 0x27,
+			0xb4, 0x08, 0xd3, 0xda, 0x7b, 0x98, 0x34, 0x73,
+			0x9f, 0xdb, 0xf6, 0x62, 0xed, 0x31, 0x41, 0x13,
+			0xd3, 0xa2, 0xa8, 0x4b, 0x3a, 0xc6, 0x1d, 0xe4,
+			0x2f, 0x8c, 0xf8, 0xfb, 0x97, 0x64, 0xf4, 0xb6,
+			0x2f, 0x80, 0x5a, 0xf3, 0x56, 0xe0, 0x40, 0x50,
+			0xd5, 0x19, 0xd0, 0x1e, 0xfc, 0xca, 0xe5, 0xc9,
+			0xd4, 0x60, 0x00, 0x81, 0x2e, 0xa3, 0xcc, 0xb6,
+			0x52, 0xf0, 0xb4, 0xdb, 0x69, 0x99, 0xce, 0x7a,
+			0x32, 0x4c, 0x08, 0xed, 0xaa, 0x10, 0x10, 0xe3,
+			0x6f, 0xee, 0x99, 0x68, 0x95, 0x9f, 0x04, 0x71,
+			0xb2, 0x49, 0x2f, 0x62, 0xa6, 0x5e, 0xb4, 0xef,
+			0x02, 0xed, 0x4f, 0x27, 0xde, 0x4a, 0x0f, 0xfd,
+			0xc1, 0xcc, 0xdd, 0x02, 0x8f, 0x08, 0x16, 0x54,
+			0xdf, 0xda, 0xca, 0xe0, 0x82, 0xf1, 0xb4, 0x31,
+			0x7a, 0xa9, 0x81, 0xfe, 0x90, 0xb7, 0x3e, 0xdb,
+			0xd3, 0x35, 0xc0, 0x20, 0x80, 0x33, 0x46, 0x4a,
+			0x63, 0xab, 0xd1, 0x0d, 0x29, 0xd2, 0xe2, 0x84,
+			0xb8, 0xdb, 0xfa, 0xe9, 0x89, 0x44, 0x86, 0x7c,
+			0xe8, 0x0b, 0xe6, 0x02, 0x6a, 0x07, 0x9b, 0x96,
+			0xd0, 0xdb, 0x2e, 0x41, 0x4c, 0xa1, 0xd5, 0x57,
+			0x45, 0x14, 0xfb, 0xe3, 0xa6, 0x72, 0x5b, 0x87,
+			0x6e, 0x0c, 0x6d, 0x5b, 0xce, 0xe0, 0x2f, 0xe2,
+			0x21, 0x81, 0x95, 0xb0, 0xe8, 0xb6, 0x32, 0x0b,
+			0xb2, 0x98, 0x13, 0x52, 0x5d, 0xfb, 0xec, 0x63,
+			0x17, 0x8a, 0x9e, 0x23, 0x22, 0x36, 0xee, 0xcd,
+			0xda, 0xdb, 0xcf, 0x3e, 0xf1, 0xc7, 0xf1, 0x01,
+			0x12, 0x93, 0x0a, 0xeb, 0x6f, 0xf2, 0x02, 0x15,
+			0x96, 0x77, 0x5d, 0xef, 0x9c, 0xfb, 0x88, 0x91,
+			0x59, 0xf9, 0x84, 0xdd, 0x9b, 0x26, 0x8d, 0x80,
+			0xf9, 0x80, 0x66, 0x2d, 0xac, 0xf7, 0x1f, 0x06,
+			0xba, 0x7f, 0xff, 0xee, 0xed, 0x40, 0x5f, 0xa5,
+			0xd6, 0xbd, 0x8c, 0x5b, 0x46, 0xd2, 0x7e, 0x48,
+			0x4a, 0x65, 0x8f, 0x08, 0x42, 0x60, 0xf7, 0x0f,
+			0xb9, 0x16, 0x0b, 0x0c, 0x1a, 0x06, 0x00, 0x00,
+		},
+		nil,
+	},
+	{ // has 1 non-empty fixed huffman block then garbage
+		"hello.txt",
+		"hello.txt + garbage",
+		"hello world\n",
+		[]byte{
+			0x1f, 0x8b, 0x08, 0x08, 0xc8, 0x58, 0x13, 0x4a,
+			0x00, 0x03, 0x68, 0x65, 0x6c, 0x6c, 0x6f, 0x2e,
+			0x74, 0x78, 0x74, 0x00, 0xcb, 0x48, 0xcd, 0xc9,
+			0xc9, 0x57, 0x28, 0xcf, 0x2f, 0xca, 0x49, 0xe1,
+			0x02, 0x00, 0x2d, 0x3b, 0x08, 0xaf, 0x0c, 0x00,
+			0x00, 0x00, 'g', 'a', 'r', 'b', 'a', 'g', 'e', '!', '!', '!',
+		},
+		ErrHeader,
+	},
+	{ // has 1 non-empty fixed huffman block not enough header
+		"hello.txt",
+		"hello.txt + garbage",
+		"hello world\n",
+		[]byte{
+			0x1f, 0x8b, 0x08, 0x08, 0xc8, 0x58, 0x13, 0x4a,
+			0x00, 0x03, 0x68, 0x65, 0x6c, 0x6c, 0x6f, 0x2e,
+			0x74, 0x78, 0x74, 0x00, 0xcb, 0x48, 0xcd, 0xc9,
+			0xc9, 0x57, 0x28, 0xcf, 0x2f, 0xca, 0x49, 0xe1,
+			0x02, 0x00, 0x2d, 0x3b, 0x08, 0xaf, 0x0c, 0x00,
+			0x00, 0x00, gzipID1,
+		},
+		io.ErrUnexpectedEOF,
+	},
+	{ // has 1 non-empty fixed huffman block but corrupt checksum
+		"hello.txt",
+		"hello.txt + corrupt checksum",
+		"hello world\n",
+		[]byte{
+			0x1f, 0x8b, 0x08, 0x08, 0xc8, 0x58, 0x13, 0x4a,
+			0x00, 0x03, 0x68, 0x65, 0x6c, 0x6c, 0x6f, 0x2e,
+			0x74, 0x78, 0x74, 0x00, 0xcb, 0x48, 0xcd, 0xc9,
+			0xc9, 0x57, 0x28, 0xcf, 0x2f, 0xca, 0x49, 0xe1,
+			0x02, 0x00, 0xff, 0xff, 0xff, 0xff, 0x0c, 0x00,
+			0x00, 0x00,
+		},
+		ErrChecksum,
+	},
+	{ // has 1 non-empty fixed huffman block but corrupt size
+		"hello.txt",
+		"hello.txt + corrupt size",
+		"hello world\n",
+		[]byte{
+			0x1f, 0x8b, 0x08, 0x08, 0xc8, 0x58, 0x13, 0x4a,
+			0x00, 0x03, 0x68, 0x65, 0x6c, 0x6c, 0x6f, 0x2e,
+			0x74, 0x78, 0x74, 0x00, 0xcb, 0x48, 0xcd, 0xc9,
+			0xc9, 0x57, 0x28, 0xcf, 0x2f, 0xca, 0x49, 0xe1,
+			0x02, 0x00, 0x2d, 0x3b, 0x08, 0xaf, 0xff, 0x00,
+			0x00, 0x00,
+		},
+		ErrChecksum,
+	},
+	{
+		"f1l3n4m3.tXt",
+		"header with all fields used",
+		"",
+		[]byte{
+			0x1f, 0x8b, 0x08, 0x1e, 0x70, 0xf0, 0xf9, 0x4a,
+			0x00, 0xaa, 0x09, 0x00, 0x7a, 0x7a, 0x05, 0x00,
+			0x61, 0x62, 0x63, 0x64, 0x65, 0x66, 0x31, 0x6c,
+			0x33, 0x6e, 0x34, 0x6d, 0x33, 0x2e, 0x74, 0x58,
+			0x74, 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06,
+			0x07, 0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e,
+			0x0f, 0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16,
+			0x17, 0x18, 0x19, 0x1a, 0x1b, 0x1c, 0x1d, 0x1e,
+			0x1f, 0x20, 0x21, 0x22, 0x23, 0x24, 0x25, 0x26,
+			0x27, 0x28, 0x29, 0x2a, 0x2b, 0x2c, 0x2d, 0x2e,
+			0x2f, 0x30, 0x31, 0x32, 0x33, 0x34, 0x35, 0x36,
+			0x37, 0x38, 0x39, 0x3a, 0x3b, 0x3c, 0x3d, 0x3e,
+			0x3f, 0x40, 0x41, 0x42, 0x43, 0x44, 0x45, 0x46,
+			0x47, 0x48, 0x49, 0x4a, 0x4b, 0x4c, 0x4d, 0x4e,
+			0x4f, 0x50, 0x51, 0x52, 0x53, 0x54, 0x55, 0x56,
+			0x57, 0x58, 0x59, 0x5a, 0x5b, 0x5c, 0x5d, 0x5e,
+			0x5f, 0x60, 0x61, 0x62, 0x63, 0x64, 0x65, 0x66,
+			0x67, 0x68, 0x69, 0x6a, 0x6b, 0x6c, 0x6d, 0x6e,
+			0x6f, 0x70, 0x71, 0x72, 0x73, 0x74, 0x75, 0x76,
+			0x77, 0x78, 0x79, 0x7a, 0x7b, 0x7c, 0x7d, 0x7e,
+			0x7f, 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86,
+			0x87, 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e,
+			0x8f, 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96,
+			0x97, 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e,
+			0x9f, 0xa0, 0xa1, 0xa2, 0xa3, 0xa4, 0xa5, 0xa6,
+			0xa7, 0xa8, 0xa9, 0xaa, 0xab, 0xac, 0xad, 0xae,
+			0xaf, 0xb0, 0xb1, 0xb2, 0xb3, 0xb4, 0xb5, 0xb6,
+			0xb7, 0xb8, 0xb9, 0xba, 0xbb, 0xbc, 0xbd, 0xbe,
+			0xbf, 0xc0, 0xc1, 0xc2, 0xc3, 0xc4, 0xc5, 0xc6,
+			0xc7, 0xc8, 0xc9, 0xca, 0xcb, 0xcc, 0xcd, 0xce,
+			0xcf, 0xd0, 0xd1, 0xd2, 0xd3, 0xd4, 0xd5, 0xd6,
+			0xd7, 0xd8, 0xd9, 0xda, 0xdb, 0xdc, 0xdd, 0xde,
+			0xdf, 0xe0, 0xe1, 0xe2, 0xe3, 0xe4, 0xe5, 0xe6,
+			0xe7, 0xe8, 0xe9, 0xea, 0xeb, 0xec, 0xed, 0xee,
+			0xef, 0xf0, 0xf1, 0xf2, 0xf3, 0xf4, 0xf5, 0xf6,
+			0xf7, 0xf8, 0xf9, 0xfa, 0xfb, 0xfc, 0xfd, 0xfe,
+			0xff, 0x00, 0x92, 0xfd, 0x01, 0x00, 0x00, 0xff,
+			0xff, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+			0x00,
+		},
+		nil,
+	},
+	{
+		"",
+		"truncated gzip file amid raw-block",
+		"hello",
+		[]byte{
+			0x1f, 0x8b, 0x08, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xff,
+			0x00, 0x0c, 0x00, 0xf3, 0xff, 0x68, 0x65, 0x6c, 0x6c, 0x6f,
+		},
+		io.ErrUnexpectedEOF,
+	},
+	{
+		"",
+		"truncated gzip file amid fixed-block",
+		"He",
+		[]byte{
+			0x1f, 0x8b, 0x08, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xff,
+			0xf2, 0x48, 0xcd,
+		},
+		io.ErrUnexpectedEOF,
+	},
+}
+
+func TestDecompressor(t *testing.T) {
+	b := new(bytes.Buffer)
+	for _, tt := range gunzipTests {
+		in := bytes.NewReader(tt.gzip)
+		gzip, err := NewReader(in)
+		if err != nil {
+			t.Errorf("%s: NewReader: %s", tt.name, err)
+			continue
+		}
+		defer gzip.Close()
+		if tt.name != gzip.Name {
+			t.Errorf("%s: got name %s", tt.name, gzip.Name)
+		}
+		b.Reset()
+		n, err := io.Copy(b, gzip)
+		if err != tt.err {
+			t.Errorf("%s: io.Copy: %v want %v", tt.name, err, tt.err)
+		}
+		s := b.String()
+		if s != tt.raw {
+			t.Errorf("%s: got %d-byte %q want %d-byte %q", tt.name, n, s, len(tt.raw), tt.raw)
+		}
+
+		// Test Reader Reset.
+		in = bytes.NewReader(tt.gzip)
+		err = gzip.Reset(in)
+		if err != nil {
+			t.Errorf("%s: Reset: %s", tt.name, err)
+			continue
+		}
+		if tt.name != gzip.Name {
+			t.Errorf("%s: got name %s", tt.name, gzip.Name)
+		}
+		b.Reset()
+		n, err = io.Copy(b, gzip)
+		if err != tt.err {
+			t.Errorf("%s: io.Copy: %v want %v", tt.name, err, tt.err)
+		}
+		s = b.String()
+		if s != tt.raw {
+			t.Errorf("%s: got %d-byte %q want %d-byte %q", tt.name, n, s, len(tt.raw), tt.raw)
+		}
+	}
+}
+
+func TestIssue6550(t *testing.T) {
+	f, err := os.Open("testdata/issue6550.gz")
+	if err != nil {
+		t.Fatal(err)
+	}
+	gzip, err := NewReader(f)
+	if err != nil {
+		t.Fatalf("NewReader(testdata/issue6550.gz): %v", err)
+	}
+	defer gzip.Close()
+	done := make(chan bool, 1)
+	go func() {
+		_, err := io.Copy(ioutil.Discard, gzip)
+		if err == nil {
+			t.Errorf("Copy succeeded")
+		} else {
+			t.Logf("Copy failed (correctly): %v", err)
+		}
+		done <- true
+	}()
+	select {
+	case <-time.After(1 * time.Second):
+		t.Errorf("Copy hung")
+	case <-done:
+		// ok
+	}
+}
+
+func TestInitialReset(t *testing.T) {
+	var r Reader
+	if err := r.Reset(bytes.NewReader(gunzipTests[1].gzip)); err != nil {
+		t.Error(err)
+	}
+	var buf bytes.Buffer
+	if _, err := io.Copy(&buf, &r); err != nil {
+		t.Error(err)
+	}
+	if s := buf.String(); s != gunzipTests[1].raw {
+		t.Errorf("got %q want %q", s, gunzipTests[1].raw)
+	}
+}
+
+func TestMultistreamFalse(t *testing.T) {
+	// Find concatenation test.
+	var tt gunzipTest
+	for _, tt = range gunzipTests {
+		if strings.HasSuffix(tt.desc, " x2") {
+			goto Found
+		}
+	}
+	t.Fatal("cannot find hello.txt x2 in gunzip tests")
+
+Found:
+	br := bytes.NewReader(tt.gzip)
+	var r Reader
+	if err := r.Reset(br); err != nil {
+		t.Fatalf("first reset: %v", err)
+	}
+
+	// Expect two streams with "hello world\n", then real EOF.
+	const hello = "hello world\n"
+
+	r.Multistream(false)
+	data, err := ioutil.ReadAll(&r)
+	if string(data) != hello || err != nil {
+		t.Fatalf("first stream = %q, %v, want %q, %v", string(data), err, hello, nil)
+	}
+
+	if err := r.Reset(br); err != nil {
+		t.Fatalf("second reset: %v", err)
+	}
+	r.Multistream(false)
+	data, err = ioutil.ReadAll(&r)
+	if string(data) != hello || err != nil {
+		t.Fatalf("second stream = %q, %v, want %q, %v", string(data), err, hello, nil)
+	}
+
+	if err := r.Reset(br); err != io.EOF {
+		t.Fatalf("third reset: err=%v, want io.EOF", err)
+	}
+}
+
+func TestWriteTo(t *testing.T) {
+	input := make([]byte, 100000)
+	n, err := rand.Read(input)
+	if err != nil {
+		t.Fatal(err)
+	}
+	if n != len(input) {
+		t.Fatal("did not fill buffer")
+	}
+	compressed := &bytes.Buffer{}
+	// Do it twice to test MultiStream functionality
+	for i := 0; i < 2; i++ {
+		w, err := NewWriterLevel(compressed, -2)
+		if err != nil {
+			t.Fatal(err)
+		}
+		n, err = w.Write(input)
+		if err != nil {
+			t.Fatal(err)
+		}
+		if n != len(input) {
+			t.Fatal("did not fill buffer")
+		}
+		w.Close()
+	}
+	input = append(input, input...)
+	buf := compressed.Bytes()
+
+	dec, err := NewReader(bytes.NewBuffer(buf))
+	if err != nil {
+		t.Fatal(err)
+	}
+	// ReadAll does not use WriteTo, but we wrap it in a NopCloser to be sure.
+	readall, err := ioutil.ReadAll(ioutil.NopCloser(dec))
+	if err != nil {
+		t.Fatal(err)
+	}
+	if len(readall) != len(input) {
+		t.Errorf("did not decompress everything, want %d, got %d", len(input), len(readall))
+	}
+	if bytes.Compare(readall, input) != 0 {
+		t.Error("output did not match input")
+	}
+
+	dec, err = NewReader(bytes.NewBuffer(buf))
+	if err != nil {
+		t.Fatal(err)
+	}
+	wtbuf := &bytes.Buffer{}
+	written, err := dec.WriteTo(wtbuf)
+	if err != nil {
+		t.Fatal(err)
+	}
+	if written != int64(len(input)) {
+		t.Error("Returned length did not match, expected", len(input), "got", written)
+	}
+	if wtbuf.Len() != len(input) {
+		t.Error("Actual Length did not match, expected", len(input), "got", wtbuf.Len())
+	}
+	if bytes.Compare(wtbuf.Bytes(), input) != 0 {
+		t.Fatal("output did not match input")
+	}
+}
+
+func TestNilStream(t *testing.T) {
+	// Go liberally interprets RFC 1952 section 2.2 to mean that a gzip file
+	// consist of zero or more members. Thus, we test that a nil stream is okay.
+	_, err := NewReader(bytes.NewReader(nil))
+	if err != io.EOF {
+		t.Fatalf("NewReader(nil) on empty stream: got %v, want io.EOF", err)
+	}
+}
+
+func TestTruncatedStreams(t *testing.T) {
+	const data = "\x1f\x8b\b\x04\x00\tn\x88\x00\xff\a\x00foo bar\xcbH\xcd\xc9\xc9\xd7Q(\xcf/\xcaI\x01\x04:r\xab\xff\f\x00\x00\x00"
+
+	// Intentionally iterate starting with at least one byte in the stream.
+	for i := 1; i < len(data)-1; i++ {
+		r, err := NewReader(strings.NewReader(data[:i]))
+		if err != nil {
+			if err != io.ErrUnexpectedEOF {
+				t.Errorf("NewReader(%d) on truncated stream: got %v, want %v", i, err, io.ErrUnexpectedEOF)
+			}
+			continue
+		}
+		_, err = io.Copy(ioutil.Discard, r)
+		if ferr, ok := err.(*flate.ReadError); ok {
+			err = ferr.Err
+		}
+		if err != io.ErrUnexpectedEOF {
+			t.Errorf("io.Copy(%d) on truncated stream: got %v, want %v", i, err, io.ErrUnexpectedEOF)
+		}
+	}
+}
+
+func BenchmarkGunzipCopy(b *testing.B) {
+	dat, _ := ioutil.ReadFile("testdata/test.json")
+	dat = append(dat, dat...)
+	dat = append(dat, dat...)
+	dat = append(dat, dat...)
+	dat = append(dat, dat...)
+	dat = append(dat, dat...)
+	dst := &bytes.Buffer{}
+	w, _ := NewWriterLevel(dst, 1)
+	_, err := w.Write(dat)
+	if err != nil {
+		b.Fatal(err)
+	}
+	w.Close()
+	input := dst.Bytes()
+	b.SetBytes(int64(len(dat)))
+	b.ResetTimer()
+	for n := 0; n < b.N; n++ {
+		r, err := NewReader(bytes.NewBuffer(input))
+		if err != nil {
+			b.Fatal(err)
+		}
+		_, err = io.Copy(ioutil.Discard, r)
+		if err != nil {
+			b.Fatal(err)
+		}
+	}
+}
+
+func BenchmarkGunzipNoWriteTo(b *testing.B) {
+	dat, _ := ioutil.ReadFile("testdata/test.json")
+	dat = append(dat, dat...)
+	dat = append(dat, dat...)
+	dat = append(dat, dat...)
+	dat = append(dat, dat...)
+	dat = append(dat, dat...)
+	dst := &bytes.Buffer{}
+	w, _ := NewWriterLevel(dst, 1)
+	_, err := w.Write(dat)
+	if err != nil {
+		b.Fatal(err)
+	}
+	w.Close()
+	input := dst.Bytes()
+	r, err := NewReader(bytes.NewBuffer(input))
+	if err != nil {
+		b.Fatal(err)
+	}
+	b.SetBytes(int64(len(dat)))
+	b.ResetTimer()
+	for n := 0; n < b.N; n++ {
+		err := r.Reset(bytes.NewBuffer(input))
+		if err != nil {
+			b.Fatal(err)
+		}
+		_, err = io.Copy(ioutil.Discard, ioutil.NopCloser(r))
+		if err != nil {
+			b.Fatal(err)
+		}
+	}
+}
+
+func BenchmarkGunzipStdlib(b *testing.B) {
+	dat, _ := ioutil.ReadFile("testdata/test.json")
+	dat = append(dat, dat...)
+	dat = append(dat, dat...)
+	dat = append(dat, dat...)
+	dat = append(dat, dat...)
+	dat = append(dat, dat...)
+	dst := &bytes.Buffer{}
+	w, _ := NewWriterLevel(dst, 1)
+	_, err := w.Write(dat)
+	if err != nil {
+		b.Fatal(err)
+	}
+	w.Close()
+	input := dst.Bytes()
+	r, err := oldgz.NewReader(bytes.NewBuffer(input))
+	if err != nil {
+		b.Fatal(err)
+	}
+	b.SetBytes(int64(len(dat)))
+	b.ResetTimer()
+	for n := 0; n < b.N; n++ {
+		err := r.Reset(bytes.NewBuffer(input))
+		if err != nil {
+			b.Fatal(err)
+		}
+		_, err = io.Copy(ioutil.Discard, r)
+		if err != nil {
+			b.Fatal(err)
+		}
+	}
+}
--- a/vendor/github.com/klauspost/compress/gzip/gzip.go
+++ b/vendor/github.com/klauspost/compress/gzip/gzip.go
@@ -0,0 +1,251 @@
+// Copyright 2010 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+package gzip
+
+import (
+	"errors"
+	"fmt"
+	"io"
+
+	"github.com/klauspost/compress/flate"
+	"github.com/klauspost/crc32"
+)
+
+// These constants are copied from the flate package, so that code that imports
+// "compress/gzip" does not also have to import "compress/flate".
+const (
+	NoCompression       = flate.NoCompression
+	BestSpeed           = flate.BestSpeed
+	BestCompression     = flate.BestCompression
+	DefaultCompression  = flate.DefaultCompression
+	ConstantCompression = flate.ConstantCompression
+	HuffmanOnly         = flate.HuffmanOnly
+)
+
+// A Writer is an io.WriteCloser.
+// Writes to a Writer are compressed and written to w.
+type Writer struct {
+	Header      // written at first call to Write, Flush, or Close
+	w           io.Writer
+	level       int
+	wroteHeader bool
+	compressor  *flate.Writer
+	digest      uint32 // CRC-32, IEEE polynomial (section 8)
+	size        uint32 // Uncompressed size (section 2.3.1)
+	closed      bool
+	buf         [10]byte
+	err         error
+}
+
+// NewWriter returns a new Writer.
+// Writes to the returned writer are compressed and written to w.
+//
+// It is the caller's responsibility to call Close on the WriteCloser when done.
+// Writes may be buffered and not flushed until Close.
+//
+// Callers that wish to set the fields in Writer.Header must do so before
+// the first call to Write, Flush, or Close.
+func NewWriter(w io.Writer) *Writer {
+	z, _ := NewWriterLevel(w, DefaultCompression)
+	return z
+}
+
+// NewWriterLevel is like NewWriter but specifies the compression level instead
+// of assuming DefaultCompression.
+//
+// The compression level can be DefaultCompression, NoCompression, or any
+// integer value between BestSpeed and BestCompression inclusive. The error
+// returned will be nil if the level is valid.
+func NewWriterLevel(w io.Writer, level int) (*Writer, error) {
+	if level < HuffmanOnly || level > BestCompression {
+		return nil, fmt.Errorf("gzip: invalid compression level: %d", level)
+	}
+	z := new(Writer)
+	z.init(w, level)
+	return z, nil
+}
+
+func (z *Writer) init(w io.Writer, level int) {
+	compressor := z.compressor
+	if compressor != nil {
+		compressor.Reset(w)
+	}
+	*z = Writer{
+		Header: Header{
+			OS: 255, // unknown
+		},
+		w:          w,
+		level:      level,
+		compressor: compressor,
+	}
+}
+
+// Reset discards the Writer z's state and makes it equivalent to the
+// result of its original state from NewWriter or NewWriterLevel, but
+// writing to w instead. This permits reusing a Writer rather than
+// allocating a new one.
+func (z *Writer) Reset(w io.Writer) {
+	z.init(w, z.level)
+}
+
+// writeBytes writes a length-prefixed byte slice to z.w.
+func (z *Writer) writeBytes(b []byte) error {
+	if len(b) > 0xffff {
+		return errors.New("gzip.Write: Extra data is too large")
+	}
+	le.PutUint16(z.buf[:2], uint16(len(b)))
+	_, err := z.w.Write(z.buf[:2])
+	if err != nil {
+		return err
+	}
+	_, err = z.w.Write(b)
+	return err
+}
+
+// writeString writes a UTF-8 string s in GZIP's format to z.w.
+// GZIP (RFC 1952) specifies that strings are NUL-terminated ISO 8859-1 (Latin-1).
+func (z *Writer) writeString(s string) (err error) {
+	// GZIP stores Latin-1 strings; error if non-Latin-1; convert if non-ASCII.
+	needconv := false
+	for _, v := range s {
+		if v == 0 || v > 0xff {
+			return errors.New("gzip.Write: non-Latin-1 header string")
+		}
+		if v > 0x7f {
+			needconv = true
+		}
+	}
+	if needconv {
+		b := make([]byte, 0, len(s))
+		for _, v := range s {
+			b = append(b, byte(v))
+		}
+		_, err = z.w.Write(b)
+	} else {
+		_, err = io.WriteString(z.w, s)
+	}
+	if err != nil {
+		return err
+	}
+	// GZIP strings are NUL-terminated.
+	z.buf[0] = 0
+	_, err = z.w.Write(z.buf[:1])
+	return err
+}
+
+// Write writes a compressed form of p to the underlying io.Writer. The
+// compressed bytes are not necessarily flushed until the Writer is closed.
+func (z *Writer) Write(p []byte) (int, error) {
+	if z.err != nil {
+		return 0, z.err
+	}
+	var n int
+	// Write the GZIP header lazily.
+	if !z.wroteHeader {
+		z.wroteHeader = true
+		z.buf[0] = gzipID1
+		z.buf[1] = gzipID2
+		z.buf[2] = gzipDeflate
+		z.buf[3] = 0
+		if z.Extra != nil {
+			z.buf[3] |= 0x04
+		}
+		if z.Name != "" {
+			z.buf[3] |= 0x08
+		}
+		if z.Comment != "" {
+			z.buf[3] |= 0x10
+		}
+		le.PutUint32(z.buf[4:8], uint32(z.ModTime.Unix()))
+		if z.level == BestCompression {
+			z.buf[8] = 2
+		} else if z.level == BestSpeed {
+			z.buf[8] = 4
+		} else {
+			z.buf[8] = 0
+		}
+		z.buf[9] = z.OS
+		n, z.err = z.w.Write(z.buf[:10])
+		if z.err != nil {
+			return n, z.err
+		}
+		if z.Extra != nil {
+			z.err = z.writeBytes(z.Extra)
+			if z.err != nil {
+				return n, z.err
+			}
+		}
+		if z.Name != "" {
+			z.err = z.writeString(z.Name)
+			if z.err != nil {
+				return n, z.err
+			}
+		}
+		if z.Comment != "" {
+			z.err = z.writeString(z.Comment)
+			if z.err != nil {
+				return n, z.err
+			}
+		}
+		if z.compressor == nil {
+			z.compressor, _ = flate.NewWriter(z.w, z.level)
+		}
+	}
+	z.size += uint32(len(p))
+	z.digest = crc32.Update(z.digest, crc32.IEEETable, p)
+	n, z.err = z.compressor.Write(p)
+	return n, z.err
+}
+
+// Flush flushes any pending compressed data to the underlying writer.
+//
+// It is useful mainly in compressed network protocols, to ensure that
+// a remote reader has enough data to reconstruct a packet. Flush does
+// not return until the data has been written. If the underlying
+// writer returns an error, Flush returns that error.
+//
+// In the terminology of the zlib library, Flush is equivalent to Z_SYNC_FLUSH.
+func (z *Writer) Flush() error {
+	if z.err != nil {
+		return z.err
+	}
+	if z.closed {
+		return nil
+	}
+	if !z.wroteHeader {
+		z.Write(nil)
+		if z.err != nil {
+			return z.err
+		}
+	}
+	z.err = z.compressor.Flush()
+	return z.err
+}
+
+// Close closes the Writer, flushing any unwritten data to the underlying
+// io.Writer, but does not close the underlying io.Writer.
+func (z *Writer) Close() error {
+	if z.err != nil {
+		return z.err
+	}
+	if z.closed {
+		return nil
+	}
+	z.closed = true
+	if !z.wroteHeader {
+		z.Write(nil)
+		if z.err != nil {
+			return z.err
+		}
+	}
+	z.err = z.compressor.Close()
+	if z.err != nil {
+		return z.err
+	}
+	le.PutUint32(z.buf[:4], z.digest)
+	le.PutUint32(z.buf[4:8], z.size)
+	_, z.err = z.w.Write(z.buf[:8])
+	return z.err
+}
--- a/vendor/github.com/klauspost/compress/gzip/gzip_test.go
+++ b/vendor/github.com/klauspost/compress/gzip/gzip_test.go
@@ -0,0 +1,519 @@
+// Copyright 2010 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+package gzip
+
+import (
+	"bufio"
+	"bytes"
+	oldgz "compress/gzip"
+	"io"
+	"io/ioutil"
+	"math/rand"
+	"testing"
+	"time"
+)
+
+// TestEmpty tests that an empty payload still forms a valid GZIP stream.
+func TestEmpty(t *testing.T) {
+	buf := new(bytes.Buffer)
+
+	if err := NewWriter(buf).Close(); err != nil {
+		t.Fatalf("Writer.Close: %v", err)
+	}
+
+	r, err := NewReader(buf)
+	if err != nil {
+		t.Fatalf("NewReader: %v", err)
+	}
+	b, err := ioutil.ReadAll(r)
+	if err != nil {
+		t.Fatalf("ReadAll: %v", err)
+	}
+	if len(b) != 0 {
+		t.Fatalf("got %d bytes, want 0", len(b))
+	}
+	if err := r.Close(); err != nil {
+		t.Fatalf("Reader.Close: %v", err)
+	}
+}
+
+// TestRoundTrip tests that gzipping and then gunzipping is the identity
+// function.
+func TestRoundTrip(t *testing.T) {
+	buf := new(bytes.Buffer)
+
+	w := NewWriter(buf)
+	w.Comment = "comment"
+	w.Extra = []byte("extra")
+	w.ModTime = time.Unix(1e8, 0)
+	w.Name = "name"
+	if _, err := w.Write([]byte("payload")); err != nil {
+		t.Fatalf("Write: %v", err)
+	}
+	if err := w.Close(); err != nil {
+		t.Fatalf("Writer.Close: %v", err)
+	}
+
+	r, err := NewReader(buf)
+	if err != nil {
+		t.Fatalf("NewReader: %v", err)
+	}
+	b, err := ioutil.ReadAll(r)
+	if err != nil {
+		t.Fatalf("ReadAll: %v", err)
+	}
+	if string(b) != "payload" {
+		t.Fatalf("payload is %q, want %q", string(b), "payload")
+	}
+	if r.Comment != "comment" {
+		t.Fatalf("comment is %q, want %q", r.Comment, "comment")
+	}
+	if string(r.Extra) != "extra" {
+		t.Fatalf("extra is %q, want %q", r.Extra, "extra")
+	}
+	if r.ModTime.Unix() != 1e8 {
+		t.Fatalf("mtime is %d, want %d", r.ModTime.Unix(), uint32(1e8))
+	}
+	if r.Name != "name" {
+		t.Fatalf("name is %q, want %q", r.Name, "name")
+	}
+	if err := r.Close(); err != nil {
+		t.Fatalf("Reader.Close: %v", err)
+	}
+}
+
+// TestLatin1 tests the internal functions for converting to and from Latin-1.
+func TestLatin1(t *testing.T) {
+	latin1 := []byte{0xc4, 'u', 0xdf, 'e', 'r', 'u', 'n', 'g', 0}
+	utf8 := "Äußerung"
+	z := Reader{r: bufio.NewReader(bytes.NewReader(latin1))}
+	s, err := z.readString()
+	if err != nil {
+		t.Fatalf("readString: %v", err)
+	}
+	if s != utf8 {
+		t.Fatalf("read latin-1: got %q, want %q", s, utf8)
+	}
+
+	buf := bytes.NewBuffer(make([]byte, 0, len(latin1)))
+	c := Writer{w: buf}
+	if err = c.writeString(utf8); err != nil {
+		t.Fatalf("writeString: %v", err)
+	}
+	s = buf.String()
+	if s != string(latin1) {
+		t.Fatalf("write utf-8: got %q, want %q", s, string(latin1))
+	}
+}
+
+// TestLatin1RoundTrip tests that metadata that is representable in Latin-1
+// survives a round trip.
+func TestLatin1RoundTrip(t *testing.T) {
+	testCases := []struct {
+		name string
+		ok   bool
+	}{
+		{"", true},
+		{"ASCII is OK", true},
+		{"unless it contains a NUL\x00", false},
+		{"no matter where \x00 occurs", false},
+		{"\x00\x00\x00", false},
+		{"Látin-1 also passes (U+00E1)", true},
+		{"but LĀtin Extended-A (U+0100) does not", false},
+		{"neither does 日本語", false},
+		{"invalid UTF-8 also \xffails", false},
+		{"\x00 as does Látin-1 with NUL", false},
+	}
+	for _, tc := range testCases {
+		buf := new(bytes.Buffer)
+
+		w := NewWriter(buf)
+		w.Name = tc.name
+		err := w.Close()
+		if (err == nil) != tc.ok {
+			t.Errorf("Writer.Close: name = %q, err = %v", tc.name, err)
+			continue
+		}
+		if !tc.ok {
+			continue
+		}
+
+		r, err := NewReader(buf)
+		if err != nil {
+			t.Errorf("NewReader: %v", err)
+			continue
+		}
+		_, err = ioutil.ReadAll(r)
+		if err != nil {
+			t.Errorf("ReadAll: %v", err)
+			continue
+		}
+		if r.Name != tc.name {
+			t.Errorf("name is %q, want %q", r.Name, tc.name)
+			continue
+		}
+		if err := r.Close(); err != nil {
+			t.Errorf("Reader.Close: %v", err)
+			continue
+		}
+	}
+}
+
+func TestWriterFlush(t *testing.T) {
+	buf := new(bytes.Buffer)
+
+	w := NewWriter(buf)
+	w.Comment = "comment"
+	w.Extra = []byte("extra")
+	w.ModTime = time.Unix(1e8, 0)
+	w.Name = "name"
+
+	n0 := buf.Len()
+	if n0 != 0 {
+		t.Fatalf("buffer size = %d before writes; want 0", n0)
+	}
+
+	if err := w.Flush(); err != nil {
+		t.Fatal(err)
+	}
+
+	n1 := buf.Len()
+	if n1 == 0 {
+		t.Fatal("no data after first flush")
+	}
+
+	w.Write([]byte("x"))
+
+	n2 := buf.Len()
+	if n1 != n2 {
+		t.Fatalf("after writing a single byte, size changed from %d to %d; want no change", n1, n2)
+	}
+
+	if err := w.Flush(); err != nil {
+		t.Fatal(err)
+	}
+
+	n3 := buf.Len()
+	if n2 == n3 {
+		t.Fatal("Flush didn't flush any data")
+	}
+}
+
+// Multiple gzip files concatenated form a valid gzip file.
+func TestConcat(t *testing.T) {
+	var buf bytes.Buffer
+	w := NewWriter(&buf)
+	w.Write([]byte("hello "))
+	w.Close()
+	w = NewWriter(&buf)
+	w.Write([]byte("world\n"))
+	w.Close()
+
+	r, err := NewReader(&buf)
+	data, err := ioutil.ReadAll(r)
+	if string(data) != "hello world\n" || err != nil {
+		t.Fatalf("ReadAll = %q, %v, want %q, nil", data, err, "hello world")
+	}
+}
+
+func TestWriterReset(t *testing.T) {
+	buf := new(bytes.Buffer)
+	buf2 := new(bytes.Buffer)
+	z := NewWriter(buf)
+	msg := []byte("hello world")
+	z.Write(msg)
+	z.Close()
+	z.Reset(buf2)
+	z.Write(msg)
+	z.Close()
+	if buf.String() != buf2.String() {
+		t.Errorf("buf2 %q != original buf of %q", buf2.String(), buf.String())
+	}
+}
+
+var testbuf []byte
+
+func testFile(i, level int, t *testing.T) {
+	dat, _ := ioutil.ReadFile("testdata/test.json")
+	dl := len(dat)
+	if len(testbuf) != i*dl {
+		// Make results predictable
+		testbuf = make([]byte, i*dl)
+		for j := 0; j < i; j++ {
+			copy(testbuf[j*dl:j*dl+dl], dat)
+		}
+	}
+
+	br := bytes.NewBuffer(testbuf)
+	var buf bytes.Buffer
+	w, err := NewWriterLevel(&buf, DefaultCompression)
+	if err != nil {
+		t.Fatal(err)
+	}
+	n, err := io.Copy(w, br)
+	if err != nil {
+		t.Fatal(err)
+	}
+	if int(n) != len(testbuf) {
+		t.Fatal("Short write:", n, "!=", testbuf)
+	}
+	err = w.Close()
+	if err != nil {
+		t.Fatal(err)
+	}
+	r, err := NewReader(&buf)
+	if err != nil {
+		t.Fatal(err.Error())
+	}
+	decoded, err := ioutil.ReadAll(r)
+	if err != nil {
+		t.Fatal(err.Error())
+	}
+	if !bytes.Equal(testbuf, decoded) {
+		t.Errorf("decoded content does not match.")
+	}
+}
+
+func TestFile1xM2(t *testing.T) { testFile(1, -2, t) }
+func TestFile1xM1(t *testing.T) { testFile(1, -1, t) }
+func TestFile1x0(t *testing.T)  { testFile(1, 0, t) }
+func TestFile1x1(t *testing.T)  { testFile(1, 1, t) }
+func TestFile1x2(t *testing.T)  { testFile(1, 2, t) }
+func TestFile1x3(t *testing.T)  { testFile(1, 3, t) }
+func TestFile1x4(t *testing.T)  { testFile(1, 4, t) }
+func TestFile1x5(t *testing.T)  { testFile(1, 5, t) }
+func TestFile1x6(t *testing.T)  { testFile(1, 6, t) }
+func TestFile1x7(t *testing.T)  { testFile(1, 7, t) }
+func TestFile1x8(t *testing.T)  { testFile(1, 8, t) }
+func TestFile1x9(t *testing.T)  { testFile(1, 9, t) }
+func TestFile10(t *testing.T)   { testFile(10, DefaultCompression, t) }
+
+func TestFile50(t *testing.T) {
+	if testing.Short() {
+		t.Skip("skipping during short test")
+	}
+	testFile(50, DefaultCompression, t)
+}
+
+func TestFile200(t *testing.T) {
+	if testing.Short() {
+		t.Skip("skipping during short test")
+	}
+	testFile(200, BestSpeed, t)
+}
+
+func testBigGzip(i int, t *testing.T) {
+	if len(testbuf) != i {
+		// Make results predictable
+		rand.Seed(1337)
+		testbuf = make([]byte, i)
+		for idx := range testbuf {
+			testbuf[idx] = byte(65 + rand.Intn(20))
+		}
+	}
+	c := BestCompression
+	if testing.Short() {
+		c = BestSpeed
+	}
+
+	br := bytes.NewBuffer(testbuf)
+	var buf bytes.Buffer
+	w, err := NewWriterLevel(&buf, c)
+	if err != nil {
+		t.Fatal(err)
+	}
+	n, err := io.Copy(w, br)
+	if err != nil {
+		t.Fatal(err)
+	}
+	if int(n) != len(testbuf) {
+		t.Fatal("Short write:", n, "!=", len(testbuf))
+	}
+	err = w.Close()
+	if err != nil {
+		t.Fatal(err.Error())
+	}
+
+	r, err := NewReader(&buf)
+	if err != nil {
+		t.Fatal(err.Error())
+	}
+	decoded, err := ioutil.ReadAll(r)
+	if err != nil {
+		t.Fatal(err.Error())
+	}
+	if !bytes.Equal(testbuf, decoded) {
+		t.Errorf("decoded content does not match.")
+	}
+}
+
+func TestGzip1K(t *testing.T)   { testBigGzip(1000, t) }
+func TestGzip100K(t *testing.T) { testBigGzip(100000, t) }
+func TestGzip1M(t *testing.T) {
+	if testing.Short() {
+		t.Skip("skipping during short test")
+	}
+
+	testBigGzip(1000000, t)
+}
+func TestGzip10M(t *testing.T) {
+	if testing.Short() {
+		t.Skip("skipping during short test")
+	}
+	testBigGzip(10000000, t)
+}
+
+// Test if two runs produce identical results.
+func TestDeterministicLM2(t *testing.T) { testDeterm(-2, t) }
+
+// Level 0 is not deterministic since it depends on the size of each write.
+// func TestDeterministicL0(t *testing.T)  { testDeterm(0, t) }
+func TestDeterministicL1(t *testing.T) { testDeterm(1, t) }
+func TestDeterministicL2(t *testing.T) { testDeterm(2, t) }
+func TestDeterministicL3(t *testing.T) { testDeterm(3, t) }
+func TestDeterministicL4(t *testing.T) { testDeterm(4, t) }
+func TestDeterministicL5(t *testing.T) { testDeterm(5, t) }
+func TestDeterministicL6(t *testing.T) { testDeterm(6, t) }
+func TestDeterministicL7(t *testing.T) { testDeterm(7, t) }
+func TestDeterministicL8(t *testing.T) { testDeterm(8, t) }
+func TestDeterministicL9(t *testing.T) { testDeterm(9, t) }
+
+func testDeterm(i int, t *testing.T) {
+	var length = 500000
+	if testing.Short() {
+		length = 100000
+	}
+	rand.Seed(1337)
+	t1 := make([]byte, length)
+	for idx := range t1 {
+		t1[idx] = byte(65 + rand.Intn(8))
+	}
+
+	br := bytes.NewBuffer(t1)
+	var b1 bytes.Buffer
+	w, err := NewWriterLevel(&b1, i)
+	if err != nil {
+		t.Fatal(err)
+	}
+	_, err = io.Copy(w, br)
+	if err != nil {
+		t.Fatal(err)
+	}
+	w.Flush()
+	w.Close()
+
+	// We recreate the buffer, so we have a goos chance of getting a
+	// different memory address.
+	rand.Seed(1337)
+	t2 := make([]byte, length)
+	for idx := range t2 {
+		t2[idx] = byte(65 + rand.Intn(8))
+	}
+
+	br2 := bytes.NewBuffer(t2)
+	var b2 bytes.Buffer
+	w2, err := NewWriterLevel(&b2, i)
+	if err != nil {
+		t.Fatal(err)
+	}
+
+	// We write the same data, but with a different size than
+	// the default copy.
+	for {
+		_, err = io.CopyN(w2, br2, 1234)
+		if err == io.EOF {
+			err = nil
+			break
+		} else if err != nil {
+			break
+		}
+	}
+	if err != nil {
+		t.Fatal(err)
+	}
+	w2.Flush()
+	w2.Close()
+
+	b1b := b1.Bytes()
+	b2b := b2.Bytes()
+
+	if bytes.Compare(b1b, b2b) != 0 {
+		t.Fatalf("Level %d did not produce deterministric result, len(a) = %d, len(b) = %d", i, len(b1b), len(b2b))
+	}
+}
+
+func BenchmarkGzipLM2(b *testing.B) { benchmarkGzipN(b, -2) }
+func BenchmarkGzipL1(b *testing.B)  { benchmarkGzipN(b, 1) }
+func BenchmarkGzipL2(b *testing.B)  { benchmarkGzipN(b, 2) }
+func BenchmarkGzipL3(b *testing.B)  { benchmarkGzipN(b, 3) }
+func BenchmarkGzipL4(b *testing.B)  { benchmarkGzipN(b, 4) }
+func BenchmarkGzipL5(b *testing.B)  { benchmarkGzipN(b, 5) }
+func BenchmarkGzipL6(b *testing.B)  { benchmarkGzipN(b, 6) }
+func BenchmarkGzipL7(b *testing.B)  { benchmarkGzipN(b, 7) }
+func BenchmarkGzipL8(b *testing.B)  { benchmarkGzipN(b, 8) }
+func BenchmarkGzipL9(b *testing.B)  { benchmarkGzipN(b, 9) }
+
+func benchmarkGzipN(b *testing.B, level int) {
+	dat, _ := ioutil.ReadFile("testdata/test.json")
+	dat = append(dat, dat...)
+	dat = append(dat, dat...)
+	dat = append(dat, dat...)
+	dat = append(dat, dat...)
+	dat = append(dat, dat...)
+	b.SetBytes(int64(len(dat)))
+	w, _ := NewWriterLevel(ioutil.Discard, level)
+	b.ResetTimer()
+	for n := 0; n < b.N; n++ {
+		w.Reset(ioutil.Discard)
+		n, err := w.Write(dat)
+		if n != len(dat) {
+			panic("short write")
+		}
+		if err != nil {
+			panic(err)
+		}
+		err = w.Close()
+		if err != nil {
+			panic(err)
+		}
+	}
+}
+
+func BenchmarkOldGzipL1(b *testing.B) { benchmarkOldGzipN(b, 1) }
+func BenchmarkOldGzipL2(b *testing.B) { benchmarkOldGzipN(b, 2) }
+func BenchmarkOldGzipL3(b *testing.B) { benchmarkOldGzipN(b, 3) }
+func BenchmarkOldGzipL4(b *testing.B) { benchmarkOldGzipN(b, 4) }
+func BenchmarkOldGzipL5(b *testing.B) { benchmarkOldGzipN(b, 5) }
+func BenchmarkOldGzipL6(b *testing.B) { benchmarkOldGzipN(b, 6) }
+func BenchmarkOldGzipL7(b *testing.B) { benchmarkOldGzipN(b, 7) }
+func BenchmarkOldGzipL8(b *testing.B) { benchmarkOldGzipN(b, 8) }
+func BenchmarkOldGzipL9(b *testing.B) { benchmarkOldGzipN(b, 9) }
+
+func benchmarkOldGzipN(b *testing.B, level int) {
+	dat, _ := ioutil.ReadFile("testdata/test.json")
+	dat = append(dat, dat...)
+	dat = append(dat, dat...)
+	dat = append(dat, dat...)
+	dat = append(dat, dat...)
+	dat = append(dat, dat...)
+
+	b.SetBytes(int64(len(dat)))
+	w, _ := oldgz.NewWriterLevel(ioutil.Discard, level)
+	b.ResetTimer()
+	for n := 0; n < b.N; n++ {
+		w.Reset(ioutil.Discard)
+		n, err := w.Write(dat)
+		if n != len(dat) {
+			panic("short write")
+		}
+		if err != nil {
+			panic(err)
+		}
+		err = w.Close()
+		if err != nil {
+			panic(err)
+		}
+	}
+}
--- a/vendor/github.com/klauspost/compress/gzip/testdata/issue6550.gz
+++ b/vendor/github.com/klauspost/compress/gzip/testdata/issue6550.gz
--- a/vendor/github.com/klauspost/compress/gzip/testdata/test.json
+++ b/vendor/github.com/klauspost/compress/gzip/testdata/test.json
--- a/vendor/github.com/klauspost/compress/snappy/.gitignore
+++ b/vendor/github.com/klauspost/compress/snappy/.gitignore
@@ -0,0 +1,16 @@
+cmd/snappytool/snappytool
+testdata/bench
+
+# These explicitly listed benchmark data files are for an obsolete version of
+# snappy_test.go.
+testdata/alice29.txt
+testdata/asyoulik.txt
+testdata/fireworks.jpeg
+testdata/geo.protodata
+testdata/html
+testdata/html_x_4
+testdata/kppkn.gtb
+testdata/lcet10.txt
+testdata/paper-100k.pdf
+testdata/plrabn12.txt
+testdata/urls.10K
--- a/vendor/github.com/klauspost/compress/snappy/AUTHORS
+++ b/vendor/github.com/klauspost/compress/snappy/AUTHORS
@@ -0,0 +1,15 @@
+# This is the official list of Snappy-Go authors for copyright purposes.
+# This file is distinct from the CONTRIBUTORS files.
+# See the latter for an explanation.
+
+# Names should be added to this file as
+#	Name or Organization <email address>
+# The email address is not required for organizations.
+
+# Please keep the list sorted.
+
+Damian Gryski <dgryski@gmail.com>
+Google Inc.
+Jan Mercl <0xjnml@gmail.com>
+Rodolfo Carvalho <rhcarvalho@gmail.com>
+Sebastien Binet <seb.binet@gmail.com>
--- a/vendor/github.com/klauspost/compress/snappy/CONTRIBUTORS
+++ b/vendor/github.com/klauspost/compress/snappy/CONTRIBUTORS
@@ -0,0 +1,37 @@
+# This is the official list of people who can contribute
+# (and typically have contributed) code to the Snappy-Go repository.
+# The AUTHORS file lists the copyright holders; this file
+# lists people.  For example, Google employees are listed here
+# but not in AUTHORS, because Google holds the copyright.
+#
+# The submission process automatically checks to make sure
+# that people submitting code are listed in this file (by email address).
+#
+# Names should be added to this file only after verifying that
+# the individual or the individual's organization has agreed to
+# the appropriate Contributor License Agreement, found here:
+#
+#     http://code.google.com/legal/individual-cla-v1.0.html
+#     http://code.google.com/legal/corporate-cla-v1.0.html
+#
+# The agreement for individuals can be filled out on the web.
+#
+# When adding J Random Contributor's name to this file,
+# either J's name or J's organization's name should be
+# added to the AUTHORS file, depending on whether the
+# individual or corporate CLA was used.
+
+# Names should be added to this file like so:
+#     Name <email address>
+
+# Please keep the list sorted.
+
+Damian Gryski <dgryski@gmail.com>
+Jan Mercl <0xjnml@gmail.com>
+Kai Backman <kaib@golang.org>
+Marc-Antoine Ruel <maruel@chromium.org>
+Nigel Tao <nigeltao@golang.org>
+Rob Pike <r@golang.org>
+Rodolfo Carvalho <rhcarvalho@gmail.com>
+Russ Cox <rsc@golang.org>
+Sebastien Binet <seb.binet@gmail.com>
--- a/vendor/github.com/klauspost/compress/snappy/LICENSE
+++ b/vendor/github.com/klauspost/compress/snappy/LICENSE
@@ -0,0 +1,27 @@
+Copyright (c) 2011 The Snappy-Go Authors. All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are
+met:
+
+   * Redistributions of source code must retain the above copyright
+notice, this list of conditions and the following disclaimer.
+   * Redistributions in binary form must reproduce the above
+copyright notice, this list of conditions and the following disclaimer
+in the documentation and/or other materials provided with the
+distribution.
+   * Neither the name of Google Inc. nor the names of its
+contributors may be used to endorse or promote products derived from
+this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
--- a/vendor/github.com/klauspost/compress/snappy/README
+++ b/vendor/github.com/klauspost/compress/snappy/README
@@ -0,0 +1,107 @@
+The Snappy compression format in the Go programming language.
+
+To download and install from source:
+$ go get github.com/golang/snappy
+
+Unless otherwise noted, the Snappy-Go source files are distributed
+under the BSD-style license found in the LICENSE file.
+
+
+
+Benchmarks.
+
+The golang/snappy benchmarks include compressing (Z) and decompressing (U) ten
+or so files, the same set used by the C++ Snappy code (github.com/google/snappy
+and note the "google", not "golang"). On an "Intel(R) Core(TM) i7-3770 CPU @
+3.40GHz", Go's GOARCH=amd64 numbers as of 2016-05-29:
+
+"go test -test.bench=."
+
+_UFlat0-8         2.19GB/s ± 0%  html
+_UFlat1-8         1.41GB/s ± 0%  urls
+_UFlat2-8         23.5GB/s ± 2%  jpg
+_UFlat3-8         1.91GB/s ± 0%  jpg_200
+_UFlat4-8         14.0GB/s ± 1%  pdf
+_UFlat5-8         1.97GB/s ± 0%  html4
+_UFlat6-8          814MB/s ± 0%  txt1
+_UFlat7-8          785MB/s ± 0%  txt2
+_UFlat8-8          857MB/s ± 0%  txt3
+_UFlat9-8          719MB/s ± 1%  txt4
+_UFlat10-8        2.84GB/s ± 0%  pb
+_UFlat11-8        1.05GB/s ± 0%  gaviota
+
+_ZFlat0-8         1.04GB/s ± 0%  html
+_ZFlat1-8          534MB/s ± 0%  urls
+_ZFlat2-8         15.7GB/s ± 1%  jpg
+_ZFlat3-8          740MB/s ± 3%  jpg_200
+_ZFlat4-8         9.20GB/s ± 1%  pdf
+_ZFlat5-8          991MB/s ± 0%  html4
+_ZFlat6-8          379MB/s ± 0%  txt1
+_ZFlat7-8          352MB/s ± 0%  txt2
+_ZFlat8-8          396MB/s ± 1%  txt3
+_ZFlat9-8          327MB/s ± 1%  txt4
+_ZFlat10-8        1.33GB/s ± 1%  pb
+_ZFlat11-8         605MB/s ± 1%  gaviota
+
+
+
+"go test -test.bench=. -tags=noasm"
+
+_UFlat0-8          621MB/s ± 2%  html
+_UFlat1-8          494MB/s ± 1%  urls
+_UFlat2-8         23.2GB/s ± 1%  jpg
+_UFlat3-8         1.12GB/s ± 1%  jpg_200
+_UFlat4-8         4.35GB/s ± 1%  pdf
+_UFlat5-8          609MB/s ± 0%  html4
+_UFlat6-8          296MB/s ± 0%  txt1
+_UFlat7-8          288MB/s ± 0%  txt2
+_UFlat8-8          309MB/s ± 1%  txt3
+_UFlat9-8          280MB/s ± 1%  txt4
+_UFlat10-8         753MB/s ± 0%  pb
+_UFlat11-8         400MB/s ± 0%  gaviota
+
+_ZFlat0-8          409MB/s ± 1%  html
+_ZFlat1-8          250MB/s ± 1%  urls
+_ZFlat2-8         12.3GB/s ± 1%  jpg
+_ZFlat3-8          132MB/s ± 0%  jpg_200
+_ZFlat4-8         2.92GB/s ± 0%  pdf
+_ZFlat5-8          405MB/s ± 1%  html4
+_ZFlat6-8          179MB/s ± 1%  txt1
+_ZFlat7-8          170MB/s ± 1%  txt2
+_ZFlat8-8          189MB/s ± 1%  txt3
+_ZFlat9-8          164MB/s ± 1%  txt4
+_ZFlat10-8         479MB/s ± 1%  pb
+_ZFlat11-8         270MB/s ± 1%  gaviota
+
+
+
+For comparison (Go's encoded output is byte-for-byte identical to C++'s), here
+are the numbers from C++ Snappy's
+
+make CXXFLAGS="-O2 -DNDEBUG -g" clean snappy_unittest.log && cat snappy_unittest.log
+
+BM_UFlat/0     2.4GB/s  html
+BM_UFlat/1     1.4GB/s  urls
+BM_UFlat/2    21.8GB/s  jpg
+BM_UFlat/3     1.5GB/s  jpg_200
+BM_UFlat/4    13.3GB/s  pdf
+BM_UFlat/5     2.1GB/s  html4
+BM_UFlat/6     1.0GB/s  txt1
+BM_UFlat/7   959.4MB/s  txt2
+BM_UFlat/8     1.0GB/s  txt3
+BM_UFlat/9   864.5MB/s  txt4
+BM_UFlat/10    2.9GB/s  pb
+BM_UFlat/11    1.2GB/s  gaviota
+
+BM_ZFlat/0   944.3MB/s  html (22.31 %)
+BM_ZFlat/1   501.6MB/s  urls (47.78 %)
+BM_ZFlat/2    14.3GB/s  jpg (99.95 %)
+BM_ZFlat/3   538.3MB/s  jpg_200 (73.00 %)
+BM_ZFlat/4     8.3GB/s  pdf (83.30 %)
+BM_ZFlat/5   903.5MB/s  html4 (22.52 %)
+BM_ZFlat/6   336.0MB/s  txt1 (57.88 %)
+BM_ZFlat/7   312.3MB/s  txt2 (61.91 %)
+BM_ZFlat/8   353.1MB/s  txt3 (54.99 %)
+BM_ZFlat/9   289.9MB/s  txt4 (66.26 %)
+BM_ZFlat/10    1.2GB/s  pb (19.68 %)
+BM_ZFlat/11  527.4MB/s  gaviota (37.72 %)
--- a/vendor/github.com/klauspost/compress/snappy/cmd/snappytool/main.cpp
+++ b/vendor/github.com/klauspost/compress/snappy/cmd/snappytool/main.cpp
@@ -0,0 +1,77 @@
+/*
+To build the snappytool binary:
+g++ main.cpp /usr/lib/libsnappy.a -o snappytool
+or, if you have built the C++ snappy library from source:
+g++ main.cpp /path/to/your/snappy/.libs/libsnappy.a -o snappytool
+after running "make" from your snappy checkout directory.
+*/
+
+#include <errno.h>
+#include <stdio.h>
+#include <string.h>
+#include <unistd.h>
+
+#include "snappy.h"
+
+#define N 1000000
+
+char dst[N];
+char src[N];
+
+int main(int argc, char** argv) {
+  // Parse args.
+  if (argc != 2) {
+    fprintf(stderr, "exactly one of -d or -e must be given\n");
+    return 1;
+  }
+  bool decode = strcmp(argv[1], "-d") == 0;
+  bool encode = strcmp(argv[1], "-e") == 0;
+  if (decode == encode) {
+    fprintf(stderr, "exactly one of -d or -e must be given\n");
+    return 1;
+  }
+
+  // Read all of stdin into src[:s].
+  size_t s = 0;
+  while (1) {
+    if (s == N) {
+      fprintf(stderr, "input too large\n");
+      return 1;
+    }
+    ssize_t n = read(0, src+s, N-s);
+    if (n == 0) {
+      break;
+    }
+    if (n < 0) {
+      fprintf(stderr, "read error: %s\n", strerror(errno));
+      // TODO: handle EAGAIN, EINTR?
+      return 1;
+    }
+    s += n;
+  }
+
+  // Encode or decode src[:s] to dst[:d], and write to stdout.
+  size_t d = 0;
+  if (encode) {
+    if (N < snappy::MaxCompressedLength(s)) {
+      fprintf(stderr, "input too large after encoding\n");
+      return 1;
+    }
+    snappy::RawCompress(src, s, dst, &d);
+  } else {
+    if (!snappy::GetUncompressedLength(src, s, &d)) {
+      fprintf(stderr, "could not get uncompressed length\n");
+      return 1;
+    }
+    if (N < d) {
+      fprintf(stderr, "input too large after decoding\n");
+      return 1;
+    }
+    if (!snappy::RawUncompress(src, s, dst)) {
+      fprintf(stderr, "input was not valid Snappy-compressed data\n");
+      return 1;
+    }
+  }
+  write(1, dst, d);
+  return 0;
+}
--- a/vendor/github.com/klauspost/compress/snappy/decode.go
+++ b/vendor/github.com/klauspost/compress/snappy/decode.go
@@ -0,0 +1,237 @@
+// Copyright 2011 The Snappy-Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+package snappy
+
+import (
+	"encoding/binary"
+	"errors"
+	"io"
+)
+
+var (
+	// ErrCorrupt reports that the input is invalid.
+	ErrCorrupt = errors.New("snappy: corrupt input")
+	// ErrTooLarge reports that the uncompressed length is too large.
+	ErrTooLarge = errors.New("snappy: decoded block is too large")
+	// ErrUnsupported reports that the input isn't supported.
+	ErrUnsupported = errors.New("snappy: unsupported input")
+
+	errUnsupportedLiteralLength = errors.New("snappy: unsupported literal length")
+)
+
+// DecodedLen returns the length of the decoded block.
+func DecodedLen(src []byte) (int, error) {
+	v, _, err := decodedLen(src)
+	return v, err
+}
+
+// decodedLen returns the length of the decoded block and the number of bytes
+// that the length header occupied.
+func decodedLen(src []byte) (blockLen, headerLen int, err error) {
+	v, n := binary.Uvarint(src)
+	if n <= 0 || v > 0xffffffff {
+		return 0, 0, ErrCorrupt
+	}
+
+	const wordSize = 32 << (^uint(0) >> 32 & 1)
+	if wordSize == 32 && v > 0x7fffffff {
+		return 0, 0, ErrTooLarge
+	}
+	return int(v), n, nil
+}
+
+const (
+	decodeErrCodeCorrupt                  = 1
+	decodeErrCodeUnsupportedLiteralLength = 2
+)
+
+// Decode returns the decoded form of src. The returned slice may be a sub-
+// slice of dst if dst was large enough to hold the entire decoded block.
+// Otherwise, a newly allocated slice will be returned.
+//
+// The dst and src must not overlap. It is valid to pass a nil dst.
+func Decode(dst, src []byte) ([]byte, error) {
+	dLen, s, err := decodedLen(src)
+	if err != nil {
+		return nil, err
+	}
+	if dLen <= len(dst) {
+		dst = dst[:dLen]
+	} else {
+		dst = make([]byte, dLen)
+	}
+	switch decode(dst, src[s:]) {
+	case 0:
+		return dst, nil
+	case decodeErrCodeUnsupportedLiteralLength:
+		return nil, errUnsupportedLiteralLength
+	}
+	return nil, ErrCorrupt
+}
+
+// NewReader returns a new Reader that decompresses from r, using the framing
+// format described at
+// https://github.com/google/snappy/blob/master/framing_format.txt
+func NewReader(r io.Reader) *Reader {
+	return &Reader{
+		r:       r,
+		decoded: make([]byte, maxBlockSize),
+		buf:     make([]byte, maxEncodedLenOfMaxBlockSize+checksumSize),
+	}
+}
+
+// Reader is an io.Reader that can read Snappy-compressed bytes.
+type Reader struct {
+	r       io.Reader
+	err     error
+	decoded []byte
+	buf     []byte
+	// decoded[i:j] contains decoded bytes that have not yet been passed on.
+	i, j       int
+	readHeader bool
+}
+
+// Reset discards any buffered data, resets all state, and switches the Snappy
+// reader to read from r. This permits reusing a Reader rather than allocating
+// a new one.
+func (r *Reader) Reset(reader io.Reader) {
+	r.r = reader
+	r.err = nil
+	r.i = 0
+	r.j = 0
+	r.readHeader = false
+}
+
+func (r *Reader) readFull(p []byte, allowEOF bool) (ok bool) {
+	if _, r.err = io.ReadFull(r.r, p); r.err != nil {
+		if r.err == io.ErrUnexpectedEOF || (r.err == io.EOF && !allowEOF) {
+			r.err = ErrCorrupt
+		}
+		return false
+	}
+	return true
+}
+
+// Read satisfies the io.Reader interface.
+func (r *Reader) Read(p []byte) (int, error) {
+	if r.err != nil {
+		return 0, r.err
+	}
+	for {
+		if r.i < r.j {
+			n := copy(p, r.decoded[r.i:r.j])
+			r.i += n
+			return n, nil
+		}
+		if !r.readFull(r.buf[:4], true) {
+			return 0, r.err
+		}
+		chunkType := r.buf[0]
+		if !r.readHeader {
+			if chunkType != chunkTypeStreamIdentifier {
+				r.err = ErrCorrupt
+				return 0, r.err
+			}
+			r.readHeader = true
+		}
+		chunkLen := int(r.buf[1]) | int(r.buf[2])<<8 | int(r.buf[3])<<16
+		if chunkLen > len(r.buf) {
+			r.err = ErrUnsupported
+			return 0, r.err
+		}
+
+		// The chunk types are specified at
+		// https://github.com/google/snappy/blob/master/framing_format.txt
+		switch chunkType {
+		case chunkTypeCompressedData:
+			// Section 4.2. Compressed data (chunk type 0x00).
+			if chunkLen < checksumSize {
+				r.err = ErrCorrupt
+				return 0, r.err
+			}
+			buf := r.buf[:chunkLen]
+			if !r.readFull(buf, false) {
+				return 0, r.err
+			}
+			checksum := uint32(buf[0]) | uint32(buf[1])<<8 | uint32(buf[2])<<16 | uint32(buf[3])<<24
+			buf = buf[checksumSize:]
+
+			n, err := DecodedLen(buf)
+			if err != nil {
+				r.err = err
+				return 0, r.err
+			}
+			if n > len(r.decoded) {
+				r.err = ErrCorrupt
+				return 0, r.err
+			}
+			if _, err := Decode(r.decoded, buf); err != nil {
+				r.err = err
+				return 0, r.err
+			}
+			if crc(r.decoded[:n]) != checksum {
+				r.err = ErrCorrupt
+				return 0, r.err
+			}
+			r.i, r.j = 0, n
+			continue
+
+		case chunkTypeUncompressedData:
+			// Section 4.3. Uncompressed data (chunk type 0x01).
+			if chunkLen < checksumSize {
+				r.err = ErrCorrupt
+				return 0, r.err
+			}
+			buf := r.buf[:checksumSize]
+			if !r.readFull(buf, false) {
+				return 0, r.err
+			}
+			checksum := uint32(buf[0]) | uint32(buf[1])<<8 | uint32(buf[2])<<16 | uint32(buf[3])<<24
+			// Read directly into r.decoded instead of via r.buf.
+			n := chunkLen - checksumSize
+			if n > len(r.decoded) {
+				r.err = ErrCorrupt
+				return 0, r.err
+			}
+			if !r.readFull(r.decoded[:n], false) {
+				return 0, r.err
+			}
+			if crc(r.decoded[:n]) != checksum {
+				r.err = ErrCorrupt
+				return 0, r.err
+			}
+			r.i, r.j = 0, n
+			continue
+
+		case chunkTypeStreamIdentifier:
+			// Section 4.1. Stream identifier (chunk type 0xff).
+			if chunkLen != len(magicBody) {
+				r.err = ErrCorrupt
+				return 0, r.err
+			}
+			if !r.readFull(r.buf[:len(magicBody)], false) {
+				return 0, r.err
+			}
+			for i := 0; i < len(magicBody); i++ {
+				if r.buf[i] != magicBody[i] {
+					r.err = ErrCorrupt
+					return 0, r.err
+				}
+			}
+			continue
+		}
+
+		if chunkType <= 0x7f {
+			// Section 4.5. Reserved unskippable chunks (chunk types 0x02-0x7f).
+			r.err = ErrUnsupported
+			return 0, r.err
+		}
+		// Section 4.4 Padding (chunk type 0xfe).
+		// Section 4.6. Reserved skippable chunks (chunk types 0x80-0xfd).
+		if !r.readFull(r.buf[:chunkLen], false) {
+			return 0, r.err
+		}
+	}
+}
--- a/vendor/github.com/klauspost/compress/snappy/decode_amd64.go
+++ b/vendor/github.com/klauspost/compress/snappy/decode_amd64.go
@@ -0,0 +1,14 @@
+// Copyright 2016 The Snappy-Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+// +build !appengine
+// +build gc
+// +build !noasm
+
+package snappy
+
+// decode has the same semantics as in decode_other.go.
+//
+//go:noescape
+func decode(dst, src []byte) int
--- a/vendor/github.com/klauspost/compress/snappy/decode_amd64.s
+++ b/vendor/github.com/klauspost/compress/snappy/decode_amd64.s
@@ -0,0 +1,490 @@
+// Copyright 2016 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+// +build !appengine
+// +build gc
+// +build !noasm
+
+#include "textflag.h"
+
+// The asm code generally follows the pure Go code in decode_other.go, except
+// where marked with a "!!!".
+
+// func decode(dst, src []byte) int
+//
+// All local variables fit into registers. The non-zero stack size is only to
+// spill registers and push args when issuing a CALL. The register allocation:
+//	- AX	scratch
+//	- BX	scratch
+//	- CX	length or x
+//	- DX	offset
+//	- SI	&src[s]
+//	- DI	&dst[d]
+//	+ R8	dst_base
+//	+ R9	dst_len
+//	+ R10	dst_base + dst_len
+//	+ R11	src_base
+//	+ R12	src_len
+//	+ R13	src_base + src_len
+//	- R14	used by doCopy
+//	- R15	used by doCopy
+//
+// The registers R8-R13 (marked with a "+") are set at the start of the
+// function, and after a CALL returns, and are not otherwise modified.
+//
+// The d variable is implicitly DI - R8,  and len(dst)-d is R10 - DI.
+// The s variable is implicitly SI - R11, and len(src)-s is R13 - SI.
+TEXT ·decode(SB), NOSPLIT, $48-56
+	// Initialize SI, DI and R8-R13.
+	MOVQ dst_base+0(FP), R8
+	MOVQ dst_len+8(FP), R9
+	MOVQ R8, DI
+	MOVQ R8, R10
+	ADDQ R9, R10
+	MOVQ src_base+24(FP), R11
+	MOVQ src_len+32(FP), R12
+	MOVQ R11, SI
+	MOVQ R11, R13
+	ADDQ R12, R13
+
+loop:
+	// for s < len(src)
+	CMPQ SI, R13
+	JEQ  end
+
+	// CX = uint32(src[s])
+	//
+	// switch src[s] & 0x03
+	MOVBLZX (SI), CX
+	MOVL    CX, BX
+	ANDL    $3, BX
+	CMPL    BX, $1
+	JAE     tagCopy
+
+	// ----------------------------------------
+	// The code below handles literal tags.
+
+	// case tagLiteral:
+	// x := uint32(src[s] >> 2)
+	// switch
+	SHRL $2, CX
+	CMPL CX, $60
+	JAE  tagLit60Plus
+
+	// case x < 60:
+	// s++
+	INCQ SI
+
+doLit:
+	// This is the end of the inner "switch", when we have a literal tag.
+	//
+	// We assume that CX == x and x fits in a uint32, where x is the variable
+	// used in the pure Go decode_other.go code.
+
+	// length = int(x) + 1
+	//
+	// Unlike the pure Go code, we don't need to check if length <= 0 because
+	// CX can hold 64 bits, so the increment cannot overflow.
+	INCQ CX
+
+	// Prepare to check if copying length bytes will run past the end of dst or
+	// src.
+	//
+	// AX = len(dst) - d
+	// BX = len(src) - s
+	MOVQ R10, AX
+	SUBQ DI, AX
+	MOVQ R13, BX
+	SUBQ SI, BX
+
+	// !!! Try a faster technique for short (16 or fewer bytes) copies.
+	//
+	// if length > 16 || len(dst)-d < 16 || len(src)-s < 16 {
+	//   goto callMemmove // Fall back on calling runtime·memmove.
+	// }
+	//
+	// The C++ snappy code calls this TryFastAppend. It also checks len(src)-s
+	// against 21 instead of 16, because it cannot assume that all of its input
+	// is contiguous in memory and so it needs to leave enough source bytes to
+	// read the next tag without refilling buffers, but Go's Decode assumes
+	// contiguousness (the src argument is a []byte).
+	CMPQ CX, $16
+	JGT  callMemmove
+	CMPQ AX, $16
+	JLT  callMemmove
+	CMPQ BX, $16
+	JLT  callMemmove
+
+	// !!! Implement the copy from src to dst as a 16-byte load and store.
+	// (Decode's documentation says that dst and src must not overlap.)
+	//
+	// This always copies 16 bytes, instead of only length bytes, but that's
+	// OK. If the input is a valid Snappy encoding then subsequent iterations
+	// will fix up the overrun. Otherwise, Decode returns a nil []byte (and a
+	// non-nil error), so the overrun will be ignored.
+	//
+	// Note that on amd64, it is legal and cheap to issue unaligned 8-byte or
+	// 16-byte loads and stores. This technique probably wouldn't be as
+	// effective on architectures that are fussier about alignment.
+	MOVOU 0(SI), X0
+	MOVOU X0, 0(DI)
+
+	// d += length
+	// s += length
+	ADDQ CX, DI
+	ADDQ CX, SI
+	JMP  loop
+
+callMemmove:
+	// if length > len(dst)-d || length > len(src)-s { etc }
+	CMPQ CX, AX
+	JGT  errCorrupt
+	CMPQ CX, BX
+	JGT  errCorrupt
+
+	// copy(dst[d:], src[s:s+length])
+	//
+	// This means calling runtime·memmove(&dst[d], &src[s], length), so we push
+	// DI, SI and CX as arguments. Coincidentally, we also need to spill those
+	// three registers to the stack, to save local variables across the CALL.
+	MOVQ DI, 0(SP)
+	MOVQ SI, 8(SP)
+	MOVQ CX, 16(SP)
+	MOVQ DI, 24(SP)
+	MOVQ SI, 32(SP)
+	MOVQ CX, 40(SP)
+	CALL runtime·memmove(SB)
+
+	// Restore local variables: unspill registers from the stack and
+	// re-calculate R8-R13.
+	MOVQ 24(SP), DI
+	MOVQ 32(SP), SI
+	MOVQ 40(SP), CX
+	MOVQ dst_base+0(FP), R8
+	MOVQ dst_len+8(FP), R9
+	MOVQ R8, R10
+	ADDQ R9, R10
+	MOVQ src_base+24(FP), R11
+	MOVQ src_len+32(FP), R12
+	MOVQ R11, R13
+	ADDQ R12, R13
+
+	// d += length
+	// s += length
+	ADDQ CX, DI
+	ADDQ CX, SI
+	JMP  loop
+
+tagLit60Plus:
+	// !!! This fragment does the
+	//
+	// s += x - 58; if uint(s) > uint(len(src)) { etc }
+	//
+	// checks. In the asm version, we code it once instead of once per switch case.
+	ADDQ CX, SI
+	SUBQ $58, SI
+	MOVQ SI, BX
+	SUBQ R11, BX
+	CMPQ BX, R12
+	JA   errCorrupt
+
+	// case x == 60:
+	CMPL CX, $61
+	JEQ  tagLit61
+	JA   tagLit62Plus
+
+	// x = uint32(src[s-1])
+	MOVBLZX -1(SI), CX
+	JMP     doLit
+
+tagLit61:
+	// case x == 61:
+	// x = uint32(src[s-2]) | uint32(src[s-1])<<8
+	MOVWLZX -2(SI), CX
+	JMP     doLit
+
+tagLit62Plus:
+	CMPL CX, $62
+	JA   tagLit63
+
+	// case x == 62:
+	// x = uint32(src[s-3]) | uint32(src[s-2])<<8 | uint32(src[s-1])<<16
+	MOVWLZX -3(SI), CX
+	MOVBLZX -1(SI), BX
+	SHLL    $16, BX
+	ORL     BX, CX
+	JMP     doLit
+
+tagLit63:
+	// case x == 63:
+	// x = uint32(src[s-4]) | uint32(src[s-3])<<8 | uint32(src[s-2])<<16 | uint32(src[s-1])<<24
+	MOVL -4(SI), CX
+	JMP  doLit
+
+// The code above handles literal tags.
+// ----------------------------------------
+// The code below handles copy tags.
+
+tagCopy4:
+	// case tagCopy4:
+	// s += 5
+	ADDQ $5, SI
+
+	// if uint(s) > uint(len(src)) { etc }
+	MOVQ SI, BX
+	SUBQ R11, BX
+	CMPQ BX, R12
+	JA   errCorrupt
+
+	// length = 1 + int(src[s-5])>>2
+	SHRQ $2, CX
+	INCQ CX
+
+	// offset = int(uint32(src[s-4]) | uint32(src[s-3])<<8 | uint32(src[s-2])<<16 | uint32(src[s-1])<<24)
+	MOVLQZX -4(SI), DX
+	JMP     doCopy
+
+tagCopy2:
+	// case tagCopy2:
+	// s += 3
+	ADDQ $3, SI
+
+	// if uint(s) > uint(len(src)) { etc }
+	MOVQ SI, BX
+	SUBQ R11, BX
+	CMPQ BX, R12
+	JA   errCorrupt
+
+	// length = 1 + int(src[s-3])>>2
+	SHRQ $2, CX
+	INCQ CX
+
+	// offset = int(uint32(src[s-2]) | uint32(src[s-1])<<8)
+	MOVWQZX -2(SI), DX
+	JMP     doCopy
+
+tagCopy:
+	// We have a copy tag. We assume that:
+	//	- BX == src[s] & 0x03
+	//	- CX == src[s]
+	CMPQ BX, $2
+	JEQ  tagCopy2
+	JA   tagCopy4
+
+	// case tagCopy1:
+	// s += 2
+	ADDQ $2, SI
+
+	// if uint(s) > uint(len(src)) { etc }
+	MOVQ SI, BX
+	SUBQ R11, BX
+	CMPQ BX, R12
+	JA   errCorrupt
+
+	// offset = int(uint32(src[s-2])&0xe0<<3 | uint32(src[s-1]))
+	MOVQ    CX, DX
+	ANDQ    $0xe0, DX
+	SHLQ    $3, DX
+	MOVBQZX -1(SI), BX
+	ORQ     BX, DX
+
+	// length = 4 + int(src[s-2])>>2&0x7
+	SHRQ $2, CX
+	ANDQ $7, CX
+	ADDQ $4, CX
+
+doCopy:
+	// This is the end of the outer "switch", when we have a copy tag.
+	//
+	// We assume that:
+	//	- CX == length && CX > 0
+	//	- DX == offset
+
+	// if offset <= 0 { etc }
+	CMPQ DX, $0
+	JLE  errCorrupt
+
+	// if d < offset { etc }
+	MOVQ DI, BX
+	SUBQ R8, BX
+	CMPQ BX, DX
+	JLT  errCorrupt
+
+	// if length > len(dst)-d { etc }
+	MOVQ R10, BX
+	SUBQ DI, BX
+	CMPQ CX, BX
+	JGT  errCorrupt
+
+	// forwardCopy(dst[d:d+length], dst[d-offset:]); d += length
+	//
+	// Set:
+	//	- R14 = len(dst)-d
+	//	- R15 = &dst[d-offset]
+	MOVQ R10, R14
+	SUBQ DI, R14
+	MOVQ DI, R15
+	SUBQ DX, R15
+
+	// !!! Try a faster technique for short (16 or fewer bytes) forward copies.
+	//
+	// First, try using two 8-byte load/stores, similar to the doLit technique
+	// above. Even if dst[d:d+length] and dst[d-offset:] can overlap, this is
+	// still OK if offset >= 8. Note that this has to be two 8-byte load/stores
+	// and not one 16-byte load/store, and the first store has to be before the
+	// second load, due to the overlap if offset is in the range [8, 16).
+	//
+	// if length > 16 || offset < 8 || len(dst)-d < 16 {
+	//   goto slowForwardCopy
+	// }
+	// copy 16 bytes
+	// d += length
+	CMPQ CX, $16
+	JGT  slowForwardCopy
+	CMPQ DX, $8
+	JLT  slowForwardCopy
+	CMPQ R14, $16
+	JLT  slowForwardCopy
+	MOVQ 0(R15), AX
+	MOVQ AX, 0(DI)
+	MOVQ 8(R15), BX
+	MOVQ BX, 8(DI)
+	ADDQ CX, DI
+	JMP  loop
+
+slowForwardCopy:
+	// !!! If the forward copy is longer than 16 bytes, or if offset < 8, we
+	// can still try 8-byte load stores, provided we can overrun up to 10 extra
+	// bytes. As above, the overrun will be fixed up by subsequent iterations
+	// of the outermost loop.
+	//
+	// The C++ snappy code calls this technique IncrementalCopyFastPath. Its
+	// commentary says:
+	//
+	// ----
+	//
+	// The main part of this loop is a simple copy of eight bytes at a time
+	// until we've copied (at least) the requested amount of bytes.  However,
+	// if d and d-offset are less than eight bytes apart (indicating a
+	// repeating pattern of length < 8), we first need to expand the pattern in
+	// order to get the correct results. For instance, if the buffer looks like
+	// this, with the eight-byte <d-offset> and <d> patterns marked as
+	// intervals:
+	//
+	//    abxxxxxxxxxxxx
+	//    [------]           d-offset
+	//      [------]         d
+	//
+	// a single eight-byte copy from <d-offset> to <d> will repeat the pattern
+	// once, after which we can move <d> two bytes without moving <d-offset>:
+	//
+	//    ababxxxxxxxxxx
+	//    [------]           d-offset
+	//        [------]       d
+	//
+	// and repeat the exercise until the two no longer overlap.
+	//
+	// This allows us to do very well in the special case of one single byte
+	// repeated many times, without taking a big hit for more general cases.
+	//
+	// The worst case of extra writing past the end of the match occurs when
+	// offset == 1 and length == 1; the last copy will read from byte positions
+	// [0..7] and write to [4..11], whereas it was only supposed to write to
+	// position 1. Thus, ten excess bytes.
+	//
+	// ----
+	//
+	// That "10 byte overrun" worst case is confirmed by Go's
+	// TestSlowForwardCopyOverrun, which also tests the fixUpSlowForwardCopy
+	// and finishSlowForwardCopy algorithm.
+	//
+	// if length > len(dst)-d-10 {
+	//   goto verySlowForwardCopy
+	// }
+	SUBQ $10, R14
+	CMPQ CX, R14
+	JGT  verySlowForwardCopy
+
+makeOffsetAtLeast8:
+	// !!! As above, expand the pattern so that offset >= 8 and we can use
+	// 8-byte load/stores.
+	//
+	// for offset < 8 {
+	//   copy 8 bytes from dst[d-offset:] to dst[d:]
+	//   length -= offset
+	//   d      += offset
+	//   offset += offset
+	//   // The two previous lines together means that d-offset, and therefore
+	//   // R15, is unchanged.
+	// }
+	CMPQ DX, $8
+	JGE  fixUpSlowForwardCopy
+	MOVQ (R15), BX
+	MOVQ BX, (DI)
+	SUBQ DX, CX
+	ADDQ DX, DI
+	ADDQ DX, DX
+	JMP  makeOffsetAtLeast8
+
+fixUpSlowForwardCopy:
+	// !!! Add length (which might be negative now) to d (implied by DI being
+	// &dst[d]) so that d ends up at the right place when we jump back to the
+	// top of the loop. Before we do that, though, we save DI to AX so that, if
+	// length is positive, copying the remaining length bytes will write to the
+	// right place.
+	MOVQ DI, AX
+	ADDQ CX, DI
+
+finishSlowForwardCopy:
+	// !!! Repeat 8-byte load/stores until length <= 0. Ending with a negative
+	// length means that we overrun, but as above, that will be fixed up by
+	// subsequent iterations of the outermost loop.
+	CMPQ CX, $0
+	JLE  loop
+	MOVQ (R15), BX
+	MOVQ BX, (AX)
+	ADDQ $8, R15
+	ADDQ $8, AX
+	SUBQ $8, CX
+	JMP  finishSlowForwardCopy
+
+verySlowForwardCopy:
+	// verySlowForwardCopy is a simple implementation of forward copy. In C
+	// parlance, this is a do/while loop instead of a while loop, since we know
+	// that length > 0. In Go syntax:
+	//
+	// for {
+	//   dst[d] = dst[d - offset]
+	//   d++
+	//   length--
+	//   if length == 0 {
+	//     break
+	//   }
+	// }
+	MOVB (R15), BX
+	MOVB BX, (DI)
+	INCQ R15
+	INCQ DI
+	DECQ CX
+	JNZ  verySlowForwardCopy
+	JMP  loop
+
+// The code above handles copy tags.
+// ----------------------------------------
+
+end:
+	// This is the end of the "for s < len(src)".
+	//
+	// if d != len(dst) { etc }
+	CMPQ DI, R10
+	JNE  errCorrupt
+
+	// return 0
+	MOVQ $0, ret+48(FP)
+	RET
+
+errCorrupt:
+	// return decodeErrCodeCorrupt
+	MOVQ $1, ret+48(FP)
+	RET
--- a/vendor/github.com/klauspost/compress/snappy/decode_other.go
+++ b/vendor/github.com/klauspost/compress/snappy/decode_other.go
@@ -0,0 +1,101 @@
+// Copyright 2016 The Snappy-Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+// +build !amd64 appengine !gc noasm
+
+package snappy
+
+// decode writes the decoding of src to dst. It assumes that the varint-encoded
+// length of the decompressed bytes has already been read, and that len(dst)
+// equals that length.
+//
+// It returns 0 on success or a decodeErrCodeXxx error code on failure.
+func decode(dst, src []byte) int {
+	var d, s, offset, length int
+	for s < len(src) {
+		switch src[s] & 0x03 {
+		case tagLiteral:
+			x := uint32(src[s] >> 2)
+			switch {
+			case x < 60:
+				s++
+			case x == 60:
+				s += 2
+				if uint(s) > uint(len(src)) { // The uint conversions catch overflow from the previous line.
+					return decodeErrCodeCorrupt
+				}
+				x = uint32(src[s-1])
+			case x == 61:
+				s += 3
+				if uint(s) > uint(len(src)) { // The uint conversions catch overflow from the previous line.
+					return decodeErrCodeCorrupt
+				}
+				x = uint32(src[s-2]) | uint32(src[s-1])<<8
+			case x == 62:
+				s += 4
+				if uint(s) > uint(len(src)) { // The uint conversions catch overflow from the previous line.
+					return decodeErrCodeCorrupt
+				}
+				x = uint32(src[s-3]) | uint32(src[s-2])<<8 | uint32(src[s-1])<<16
+			case x == 63:
+				s += 5
+				if uint(s) > uint(len(src)) { // The uint conversions catch overflow from the previous line.
+					return decodeErrCodeCorrupt
+				}
+				x = uint32(src[s-4]) | uint32(src[s-3])<<8 | uint32(src[s-2])<<16 | uint32(src[s-1])<<24
+			}
+			length = int(x) + 1
+			if length <= 0 {
+				return decodeErrCodeUnsupportedLiteralLength
+			}
+			if length > len(dst)-d || length > len(src)-s {
+				return decodeErrCodeCorrupt
+			}
+			copy(dst[d:], src[s:s+length])
+			d += length
+			s += length
+			continue
+
+		case tagCopy1:
+			s += 2
+			if uint(s) > uint(len(src)) { // The uint conversions catch overflow from the previous line.
+				return decodeErrCodeCorrupt
+			}
+			length = 4 + int(src[s-2])>>2&0x7
+			offset = int(uint32(src[s-2])&0xe0<<3 | uint32(src[s-1]))
+
+		case tagCopy2:
+			s += 3
+			if uint(s) > uint(len(src)) { // The uint conversions catch overflow from the previous line.
+				return decodeErrCodeCorrupt
+			}
+			length = 1 + int(src[s-3])>>2
+			offset = int(uint32(src[s-2]) | uint32(src[s-1])<<8)
+
+		case tagCopy4:
+			s += 5
+			if uint(s) > uint(len(src)) { // The uint conversions catch overflow from the previous line.
+				return decodeErrCodeCorrupt
+			}
+			length = 1 + int(src[s-5])>>2
+			offset = int(uint32(src[s-4]) | uint32(src[s-3])<<8 | uint32(src[s-2])<<16 | uint32(src[s-1])<<24)
+		}
+
+		if offset <= 0 || d < offset || length > len(dst)-d {
+			return decodeErrCodeCorrupt
+		}
+		// Copy from an earlier sub-slice of dst to a later sub-slice. Unlike
+		// the built-in copy function, this byte-by-byte copy always runs
+		// forwards, even if the slices overlap. Conceptually, this is:
+		//
+		// d += forwardCopy(dst[d:d+length], dst[d-offset:])
+		for end := d + length; d != end; d++ {
+			dst[d] = dst[d-offset]
+		}
+	}
+	if d != len(dst) {
+		return decodeErrCodeCorrupt
+	}
+	return 0
+}
--- a/vendor/github.com/klauspost/compress/snappy/encode.go
+++ b/vendor/github.com/klauspost/compress/snappy/encode.go
@@ -0,0 +1,285 @@
+// Copyright 2011 The Snappy-Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+package snappy
+
+import (
+	"encoding/binary"
+	"errors"
+	"io"
+)
+
+// Encode returns the encoded form of src. The returned slice may be a sub-
+// slice of dst if dst was large enough to hold the entire encoded block.
+// Otherwise, a newly allocated slice will be returned.
+//
+// The dst and src must not overlap. It is valid to pass a nil dst.
+func Encode(dst, src []byte) []byte {
+	if n := MaxEncodedLen(len(src)); n < 0 {
+		panic(ErrTooLarge)
+	} else if len(dst) < n {
+		dst = make([]byte, n)
+	}
+
+	// The block starts with the varint-encoded length of the decompressed bytes.
+	d := binary.PutUvarint(dst, uint64(len(src)))
+
+	for len(src) > 0 {
+		p := src
+		src = nil
+		if len(p) > maxBlockSize {
+			p, src = p[:maxBlockSize], p[maxBlockSize:]
+		}
+		if len(p) < minNonLiteralBlockSize {
+			d += emitLiteral(dst[d:], p)
+		} else {
+			d += encodeBlock(dst[d:], p)
+		}
+	}
+	return dst[:d]
+}
+
+// inputMargin is the minimum number of extra input bytes to keep, inside
+// encodeBlock's inner loop. On some architectures, this margin lets us
+// implement a fast path for emitLiteral, where the copy of short (<= 16 byte)
+// literals can be implemented as a single load to and store from a 16-byte
+// register. That literal's actual length can be as short as 1 byte, so this
+// can copy up to 15 bytes too much, but that's OK as subsequent iterations of
+// the encoding loop will fix up the copy overrun, and this inputMargin ensures
+// that we don't overrun the dst and src buffers.
+const inputMargin = 16 - 1
+
+// minNonLiteralBlockSize is the minimum size of the input to encodeBlock that
+// could be encoded with a copy tag. This is the minimum with respect to the
+// algorithm used by encodeBlock, not a minimum enforced by the file format.
+//
+// The encoded output must start with at least a 1 byte literal, as there are
+// no previous bytes to copy. A minimal (1 byte) copy after that, generated
+// from an emitCopy call in encodeBlock's main loop, would require at least
+// another inputMargin bytes, for the reason above: we want any emitLiteral
+// calls inside encodeBlock's main loop to use the fast path if possible, which
+// requires being able to overrun by inputMargin bytes. Thus,
+// minNonLiteralBlockSize equals 1 + 1 + inputMargin.
+//
+// The C++ code doesn't use this exact threshold, but it could, as discussed at
+// https://groups.google.com/d/topic/snappy-compression/oGbhsdIJSJ8/discussion
+// The difference between Go (2+inputMargin) and C++ (inputMargin) is purely an
+// optimization. It should not affect the encoded form. This is tested by
+// TestSameEncodingAsCppShortCopies.
+const minNonLiteralBlockSize = 1 + 1 + inputMargin
+
+// MaxEncodedLen returns the maximum length of a snappy block, given its
+// uncompressed length.
+//
+// It will return a negative value if srcLen is too large to encode.
+func MaxEncodedLen(srcLen int) int {
+	n := uint64(srcLen)
+	if n > 0xffffffff {
+		return -1
+	}
+	// Compressed data can be defined as:
+	//    compressed := item* literal*
+	//    item       := literal* copy
+	//
+	// The trailing literal sequence has a space blowup of at most 62/60
+	// since a literal of length 60 needs one tag byte + one extra byte
+	// for length information.
+	//
+	// Item blowup is trickier to measure. Suppose the "copy" op copies
+	// 4 bytes of data. Because of a special check in the encoding code,
+	// we produce a 4-byte copy only if the offset is < 65536. Therefore
+	// the copy op takes 3 bytes to encode, and this type of item leads
+	// to at most the 62/60 blowup for representing literals.
+	//
+	// Suppose the "copy" op copies 5 bytes of data. If the offset is big
+	// enough, it will take 5 bytes to encode the copy op. Therefore the
+	// worst case here is a one-byte literal followed by a five-byte copy.
+	// That is, 6 bytes of input turn into 7 bytes of "compressed" data.
+	//
+	// This last factor dominates the blowup, so the final estimate is:
+	n = 32 + n + n/6
+	if n > 0xffffffff {
+		return -1
+	}
+	return int(n)
+}
+
+var errClosed = errors.New("snappy: Writer is closed")
+
+// NewWriter returns a new Writer that compresses to w.
+//
+// The Writer returned does not buffer writes. There is no need to Flush or
+// Close such a Writer.
+//
+// Deprecated: the Writer returned is not suitable for many small writes, only
+// for few large writes. Use NewBufferedWriter instead, which is efficient
+// regardless of the frequency and shape of the writes, and remember to Close
+// that Writer when done.
+func NewWriter(w io.Writer) *Writer {
+	return &Writer{
+		w:    w,
+		obuf: make([]byte, obufLen),
+	}
+}
+
+// NewBufferedWriter returns a new Writer that compresses to w, using the
+// framing format described at
+// https://github.com/google/snappy/blob/master/framing_format.txt
+//
+// The Writer returned buffers writes. Users must call Close to guarantee all
+// data has been forwarded to the underlying io.Writer. They may also call
+// Flush zero or more times before calling Close.
+func NewBufferedWriter(w io.Writer) *Writer {
+	return &Writer{
+		w:    w,
+		ibuf: make([]byte, 0, maxBlockSize),
+		obuf: make([]byte, obufLen),
+	}
+}
+
+// Writer is an io.Writer than can write Snappy-compressed bytes.
+type Writer struct {
+	w   io.Writer
+	err error
+
+	// ibuf is a buffer for the incoming (uncompressed) bytes.
+	//
+	// Its use is optional. For backwards compatibility, Writers created by the
+	// NewWriter function have ibuf == nil, do not buffer incoming bytes, and
+	// therefore do not need to be Flush'ed or Close'd.
+	ibuf []byte
+
+	// obuf is a buffer for the outgoing (compressed) bytes.
+	obuf []byte
+
+	// wroteStreamHeader is whether we have written the stream header.
+	wroteStreamHeader bool
+}
+
+// Reset discards the writer's state and switches the Snappy writer to write to
+// w. This permits reusing a Writer rather than allocating a new one.
+func (w *Writer) Reset(writer io.Writer) {
+	w.w = writer
+	w.err = nil
+	if w.ibuf != nil {
+		w.ibuf = w.ibuf[:0]
+	}
+	w.wroteStreamHeader = false
+}
+
+// Write satisfies the io.Writer interface.
+func (w *Writer) Write(p []byte) (nRet int, errRet error) {
+	if w.ibuf == nil {
+		// Do not buffer incoming bytes. This does not perform or compress well
+		// if the caller of Writer.Write writes many small slices. This
+		// behavior is therefore deprecated, but still supported for backwards
+		// compatibility with code that doesn't explicitly Flush or Close.
+		return w.write(p)
+	}
+
+	// The remainder of this method is based on bufio.Writer.Write from the
+	// standard library.
+
+	for len(p) > (cap(w.ibuf)-len(w.ibuf)) && w.err == nil {
+		var n int
+		if len(w.ibuf) == 0 {
+			// Large write, empty buffer.
+			// Write directly from p to avoid copy.
+			n, _ = w.write(p)
+		} else {
+			n = copy(w.ibuf[len(w.ibuf):cap(w.ibuf)], p)
+			w.ibuf = w.ibuf[:len(w.ibuf)+n]
+			w.Flush()
+		}
+		nRet += n
+		p = p[n:]
+	}
+	if w.err != nil {
+		return nRet, w.err
+	}
+	n := copy(w.ibuf[len(w.ibuf):cap(w.ibuf)], p)
+	w.ibuf = w.ibuf[:len(w.ibuf)+n]
+	nRet += n
+	return nRet, nil
+}
+
+func (w *Writer) write(p []byte) (nRet int, errRet error) {
+	if w.err != nil {
+		return 0, w.err
+	}
+	for len(p) > 0 {
+		obufStart := len(magicChunk)
+		if !w.wroteStreamHeader {
+			w.wroteStreamHeader = true
+			copy(w.obuf, magicChunk)
+			obufStart = 0
+		}
+
+		var uncompressed []byte
+		if len(p) > maxBlockSize {
+			uncompressed, p = p[:maxBlockSize], p[maxBlockSize:]
+		} else {
+			uncompressed, p = p, nil
+		}
+		checksum := crc(uncompressed)
+
+		// Compress the buffer, discarding the result if the improvement
+		// isn't at least 12.5%.
+		compressed := Encode(w.obuf[obufHeaderLen:], uncompressed)
+		chunkType := uint8(chunkTypeCompressedData)
+		chunkLen := 4 + len(compressed)
+		obufEnd := obufHeaderLen + len(compressed)
+		if len(compressed) >= len(uncompressed)-len(uncompressed)/8 {
+			chunkType = chunkTypeUncompressedData
+			chunkLen = 4 + len(uncompressed)
+			obufEnd = obufHeaderLen
+		}
+
+		// Fill in the per-chunk header that comes before the body.
+		w.obuf[len(magicChunk)+0] = chunkType
+		w.obuf[len(magicChunk)+1] = uint8(chunkLen >> 0)
+		w.obuf[len(magicChunk)+2] = uint8(chunkLen >> 8)
+		w.obuf[len(magicChunk)+3] = uint8(chunkLen >> 16)
+		w.obuf[len(magicChunk)+4] = uint8(checksum >> 0)
+		w.obuf[len(magicChunk)+5] = uint8(checksum >> 8)
+		w.obuf[len(magicChunk)+6] = uint8(checksum >> 16)
+		w.obuf[len(magicChunk)+7] = uint8(checksum >> 24)
+
+		if _, err := w.w.Write(w.obuf[obufStart:obufEnd]); err != nil {
+			w.err = err
+			return nRet, err
+		}
+		if chunkType == chunkTypeUncompressedData {
+			if _, err := w.w.Write(uncompressed); err != nil {
+				w.err = err
+				return nRet, err
+			}
+		}
+		nRet += len(uncompressed)
+	}
+	return nRet, nil
+}
+
+// Flush flushes the Writer to its underlying io.Writer.
+func (w *Writer) Flush() error {
+	if w.err != nil {
+		return w.err
+	}
+	if len(w.ibuf) == 0 {
+		return nil
+	}
+	w.write(w.ibuf)
+	w.ibuf = w.ibuf[:0]
+	return w.err
+}
+
+// Close calls Flush and then closes the Writer.
+func (w *Writer) Close() error {
+	w.Flush()
+	ret := w.err
+	if w.err == nil {
+		w.err = errClosed
+	}
+	return ret
+}
--- a/vendor/github.com/klauspost/compress/snappy/encode_amd64.go
+++ b/vendor/github.com/klauspost/compress/snappy/encode_amd64.go
@@ -0,0 +1,29 @@
+// Copyright 2016 The Snappy-Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+// +build !appengine
+// +build gc
+// +build !noasm
+
+package snappy
+
+// emitLiteral has the same semantics as in encode_other.go.
+//
+//go:noescape
+func emitLiteral(dst, lit []byte) int
+
+// emitCopy has the same semantics as in encode_other.go.
+//
+//go:noescape
+func emitCopy(dst []byte, offset, length int) int
+
+// extendMatch has the same semantics as in encode_other.go.
+//
+//go:noescape
+func extendMatch(src []byte, i, j int) int
+
+// encodeBlock has the same semantics as in encode_other.go.
+//
+//go:noescape
+func encodeBlock(dst, src []byte) (d int)
--- a/vendor/github.com/klauspost/compress/snappy/encode_amd64.s
+++ b/vendor/github.com/klauspost/compress/snappy/encode_amd64.s
@@ -0,0 +1,730 @@
+// Copyright 2016 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+// +build !appengine
+// +build gc
+// +build !noasm
+
+#include "textflag.h"
+
+// The XXX lines assemble on Go 1.4, 1.5 and 1.7, but not 1.6, due to a
+// Go toolchain regression. See https://github.com/golang/go/issues/15426 and
+// https://github.com/golang/snappy/issues/29
+//
+// As a workaround, the package was built with a known good assembler, and
+// those instructions were disassembled by "objdump -d" to yield the
+//	4e 0f b7 7c 5c 78       movzwq 0x78(%rsp,%r11,2),%r15
+// style comments, in AT&T asm syntax. Note that rsp here is a physical
+// register, not Go/asm's SP pseudo-register (see https://golang.org/doc/asm).
+// The instructions were then encoded as "BYTE $0x.." sequences, which assemble
+// fine on Go 1.6.
+
+// The asm code generally follows the pure Go code in encode_other.go, except
+// where marked with a "!!!".
+
+// ----------------------------------------------------------------------------
+
+// func emitLiteral(dst, lit []byte) int
+//
+// All local variables fit into registers. The register allocation:
+//	- AX	len(lit)
+//	- BX	n
+//	- DX	return value
+//	- DI	&dst[i]
+//	- R10	&lit[0]
+//
+// The 24 bytes of stack space is to call runtime·memmove.
+//
+// The unusual register allocation of local variables, such as R10 for the
+// source pointer, matches the allocation used at the call site in encodeBlock,
+// which makes it easier to manually inline this function.
+TEXT ·emitLiteral(SB), NOSPLIT, $24-56
+	MOVQ dst_base+0(FP), DI
+	MOVQ lit_base+24(FP), R10
+	MOVQ lit_len+32(FP), AX
+	MOVQ AX, DX
+	MOVL AX, BX
+	SUBL $1, BX
+
+	CMPL BX, $60
+	JLT  oneByte
+	CMPL BX, $256
+	JLT  twoBytes
+
+threeBytes:
+	MOVB $0xf4, 0(DI)
+	MOVW BX, 1(DI)
+	ADDQ $3, DI
+	ADDQ $3, DX
+	JMP  memmove
+
+twoBytes:
+	MOVB $0xf0, 0(DI)
+	MOVB BX, 1(DI)
+	ADDQ $2, DI
+	ADDQ $2, DX
+	JMP  memmove
+
+oneByte:
+	SHLB $2, BX
+	MOVB BX, 0(DI)
+	ADDQ $1, DI
+	ADDQ $1, DX
+
+memmove:
+	MOVQ DX, ret+48(FP)
+
+	// copy(dst[i:], lit)
+	//
+	// This means calling runtime·memmove(&dst[i], &lit[0], len(lit)), so we push
+	// DI, R10 and AX as arguments.
+	MOVQ DI, 0(SP)
+	MOVQ R10, 8(SP)
+	MOVQ AX, 16(SP)
+	CALL runtime·memmove(SB)
+	RET
+
+// ----------------------------------------------------------------------------
+
+// func emitCopy(dst []byte, offset, length int) int
+//
+// All local variables fit into registers. The register allocation:
+//	- AX	length
+//	- SI	&dst[0]
+//	- DI	&dst[i]
+//	- R11	offset
+//
+// The unusual register allocation of local variables, such as R11 for the
+// offset, matches the allocation used at the call site in encodeBlock, which
+// makes it easier to manually inline this function.
+TEXT ·emitCopy(SB), NOSPLIT, $0-48
+	MOVQ dst_base+0(FP), DI
+	MOVQ DI, SI
+	MOVQ offset+24(FP), R11
+	MOVQ length+32(FP), AX
+
+loop0:
+	// for length >= 68 { etc }
+	CMPL AX, $68
+	JLT  step1
+
+	// Emit a length 64 copy, encoded as 3 bytes.
+	MOVB $0xfe, 0(DI)
+	MOVW R11, 1(DI)
+	ADDQ $3, DI
+	SUBL $64, AX
+	JMP  loop0
+
+step1:
+	// if length > 64 { etc }
+	CMPL AX, $64
+	JLE  step2
+
+	// Emit a length 60 copy, encoded as 3 bytes.
+	MOVB $0xee, 0(DI)
+	MOVW R11, 1(DI)
+	ADDQ $3, DI
+	SUBL $60, AX
+
+step2:
+	// if length >= 12 || offset >= 2048 { goto step3 }
+	CMPL AX, $12
+	JGE  step3
+	CMPL R11, $2048
+	JGE  step3
+
+	// Emit the remaining copy, encoded as 2 bytes.
+	MOVB R11, 1(DI)
+	SHRL $8, R11
+	SHLB $5, R11
+	SUBB $4, AX
+	SHLB $2, AX
+	ORB  AX, R11
+	ORB  $1, R11
+	MOVB R11, 0(DI)
+	ADDQ $2, DI
+
+	// Return the number of bytes written.
+	SUBQ SI, DI
+	MOVQ DI, ret+40(FP)
+	RET
+
+step3:
+	// Emit the remaining copy, encoded as 3 bytes.
+	SUBL $1, AX
+	SHLB $2, AX
+	ORB  $2, AX
+	MOVB AX, 0(DI)
+	MOVW R11, 1(DI)
+	ADDQ $3, DI
+
+	// Return the number of bytes written.
+	SUBQ SI, DI
+	MOVQ DI, ret+40(FP)
+	RET
+
+// ----------------------------------------------------------------------------
+
+// func extendMatch(src []byte, i, j int) int
+//
+// All local variables fit into registers. The register allocation:
+//	- DX	&src[0]
+//	- SI	&src[j]
+//	- R13	&src[len(src) - 8]
+//	- R14	&src[len(src)]
+//	- R15	&src[i]
+//
+// The unusual register allocation of local variables, such as R15 for a source
+// pointer, matches the allocation used at the call site in encodeBlock, which
+// makes it easier to manually inline this function.
+TEXT ·extendMatch(SB), NOSPLIT, $0-48
+	MOVQ src_base+0(FP), DX
+	MOVQ src_len+8(FP), R14
+	MOVQ i+24(FP), R15
+	MOVQ j+32(FP), SI
+	ADDQ DX, R14
+	ADDQ DX, R15
+	ADDQ DX, SI
+	MOVQ R14, R13
+	SUBQ $8, R13
+
+cmp8:
+	// As long as we are 8 or more bytes before the end of src, we can load and
+	// compare 8 bytes at a time. If those 8 bytes are equal, repeat.
+	CMPQ SI, R13
+	JA   cmp1
+	MOVQ (R15), AX
+	MOVQ (SI), BX
+	CMPQ AX, BX
+	JNE  bsf
+	ADDQ $8, R15
+	ADDQ $8, SI
+	JMP  cmp8
+
+bsf:
+	// If those 8 bytes were not equal, XOR the two 8 byte values, and return
+	// the index of the first byte that differs. The BSF instruction finds the
+	// least significant 1 bit, the amd64 architecture is little-endian, and
+	// the shift by 3 converts a bit index to a byte index.
+	XORQ AX, BX
+	BSFQ BX, BX
+	SHRQ $3, BX
+	ADDQ BX, SI
+
+	// Convert from &src[ret] to ret.
+	SUBQ DX, SI
+	MOVQ SI, ret+40(FP)
+	RET
+
+cmp1:
+	// In src's tail, compare 1 byte at a time.
+	CMPQ SI, R14
+	JAE  extendMatchEnd
+	MOVB (R15), AX
+	MOVB (SI), BX
+	CMPB AX, BX
+	JNE  extendMatchEnd
+	ADDQ $1, R15
+	ADDQ $1, SI
+	JMP  cmp1
+
+extendMatchEnd:
+	// Convert from &src[ret] to ret.
+	SUBQ DX, SI
+	MOVQ SI, ret+40(FP)
+	RET
+
+// ----------------------------------------------------------------------------
+
+// func encodeBlock(dst, src []byte) (d int)
+//
+// All local variables fit into registers, other than "var table". The register
+// allocation:
+//	- AX	.	.
+//	- BX	.	.
+//	- CX	56	shift (note that amd64 shifts by non-immediates must use CX).
+//	- DX	64	&src[0], tableSize
+//	- SI	72	&src[s]
+//	- DI	80	&dst[d]
+//	- R9	88	sLimit
+//	- R10	.	&src[nextEmit]
+//	- R11	96	prevHash, currHash, nextHash, offset
+//	- R12	104	&src[base], skip
+//	- R13	.	&src[nextS], &src[len(src) - 8]
+//	- R14	.	len(src), bytesBetweenHashLookups, &src[len(src)], x
+//	- R15	112	candidate
+//
+// The second column (56, 64, etc) is the stack offset to spill the registers
+// when calling other functions. We could pack this slightly tighter, but it's
+// simpler to have a dedicated spill map independent of the function called.
+//
+// "var table [maxTableSize]uint16" takes up 32768 bytes of stack space. An
+// extra 56 bytes, to call other functions, and an extra 64 bytes, to spill
+// local variables (registers) during calls gives 32768 + 56 + 64 = 32888.
+TEXT ·encodeBlock(SB), 0, $32888-56
+	MOVQ dst_base+0(FP), DI
+	MOVQ src_base+24(FP), SI
+	MOVQ src_len+32(FP), R14
+
+	// shift, tableSize := uint32(32-8), 1<<8
+	MOVQ $24, CX
+	MOVQ $256, DX
+
+calcShift:
+	// for ; tableSize < maxTableSize && tableSize < len(src); tableSize *= 2 {
+	//	shift--
+	// }
+	CMPQ DX, $16384
+	JGE  varTable
+	CMPQ DX, R14
+	JGE  varTable
+	SUBQ $1, CX
+	SHLQ $1, DX
+	JMP  calcShift
+
+varTable:
+	// var table [maxTableSize]uint16
+	//
+	// In the asm code, unlike the Go code, we can zero-initialize only the
+	// first tableSize elements. Each uint16 element is 2 bytes and each MOVOU
+	// writes 16 bytes, so we can do only tableSize/8 writes instead of the
+	// 2048 writes that would zero-initialize all of table's 32768 bytes.
+	SHRQ $3, DX
+	LEAQ table-32768(SP), BX
+	PXOR X0, X0
+
+memclr:
+	MOVOU X0, 0(BX)
+	ADDQ  $16, BX
+	SUBQ  $1, DX
+	JNZ   memclr
+
+	// !!! DX = &src[0]
+	MOVQ SI, DX
+
+	// sLimit := len(src) - inputMargin
+	MOVQ R14, R9
+	SUBQ $15, R9
+
+	// !!! Pre-emptively spill CX, DX and R9 to the stack. Their values don't
+	// change for the rest of the function.
+	MOVQ CX, 56(SP)
+	MOVQ DX, 64(SP)
+	MOVQ R9, 88(SP)
+
+	// nextEmit := 0
+	MOVQ DX, R10
+
+	// s := 1
+	ADDQ $1, SI
+
+	// nextHash := hash(load32(src, s), shift)
+	MOVL  0(SI), R11
+	IMULL $0x1e35a7bd, R11
+	SHRL  CX, R11
+
+outer:
+	// for { etc }
+
+	// skip := 32
+	MOVQ $32, R12
+
+	// nextS := s
+	MOVQ SI, R13
+
+	// candidate := 0
+	MOVQ $0, R15
+
+inner0:
+	// for { etc }
+
+	// s := nextS
+	MOVQ R13, SI
+
+	// bytesBetweenHashLookups := skip >> 5
+	MOVQ R12, R14
+	SHRQ $5, R14
+
+	// nextS = s + bytesBetweenHashLookups
+	ADDQ R14, R13
+
+	// skip += bytesBetweenHashLookups
+	ADDQ R14, R12
+
+	// if nextS > sLimit { goto emitRemainder }
+	MOVQ R13, AX
+	SUBQ DX, AX
+	CMPQ AX, R9
+	JA   emitRemainder
+
+	// candidate = int(table[nextHash])
+	// XXX: MOVWQZX table-32768(SP)(R11*2), R15
+	// XXX: 4e 0f b7 7c 5c 78       movzwq 0x78(%rsp,%r11,2),%r15
+	BYTE $0x4e
+	BYTE $0x0f
+	BYTE $0xb7
+	BYTE $0x7c
+	BYTE $0x5c
+	BYTE $0x78
+
+	// table[nextHash] = uint16(s)
+	MOVQ SI, AX
+	SUBQ DX, AX
+
+	// XXX: MOVW AX, table-32768(SP)(R11*2)
+	// XXX: 66 42 89 44 5c 78       mov    %ax,0x78(%rsp,%r11,2)
+	BYTE $0x66
+	BYTE $0x42
+	BYTE $0x89
+	BYTE $0x44
+	BYTE $0x5c
+	BYTE $0x78
+
+	// nextHash = hash(load32(src, nextS), shift)
+	MOVL  0(R13), R11
+	IMULL $0x1e35a7bd, R11
+	SHRL  CX, R11
+
+	// if load32(src, s) != load32(src, candidate) { continue } break
+	MOVL 0(SI), AX
+	MOVL (DX)(R15*1), BX
+	CMPL AX, BX
+	JNE  inner0
+
+fourByteMatch:
+	// As per the encode_other.go code:
+	//
+	// A 4-byte match has been found. We'll later see etc.
+
+	// !!! Jump to a fast path for short (<= 16 byte) literals. See the comment
+	// on inputMargin in encode.go.
+	MOVQ SI, AX
+	SUBQ R10, AX
+	CMPQ AX, $16
+	JLE  emitLiteralFastPath
+
+	// ----------------------------------------
+	// Begin inline of the emitLiteral call.
+	//
+	// d += emitLiteral(dst[d:], src[nextEmit:s])
+
+	MOVL AX, BX
+	SUBL $1, BX
+
+	CMPL BX, $60
+	JLT  inlineEmitLiteralOneByte
+	CMPL BX, $256
+	JLT  inlineEmitLiteralTwoBytes
+
+inlineEmitLiteralThreeBytes:
+	MOVB $0xf4, 0(DI)
+	MOVW BX, 1(DI)
+	ADDQ $3, DI
+	JMP  inlineEmitLiteralMemmove
+
+inlineEmitLiteralTwoBytes:
+	MOVB $0xf0, 0(DI)
+	MOVB BX, 1(DI)
+	ADDQ $2, DI
+	JMP  inlineEmitLiteralMemmove
+
+inlineEmitLiteralOneByte:
+	SHLB $2, BX
+	MOVB BX, 0(DI)
+	ADDQ $1, DI
+
+inlineEmitLiteralMemmove:
+	// Spill local variables (registers) onto the stack; call; unspill.
+	//
+	// copy(dst[i:], lit)
+	//
+	// This means calling runtime·memmove(&dst[i], &lit[0], len(lit)), so we push
+	// DI, R10 and AX as arguments.
+	MOVQ DI, 0(SP)
+	MOVQ R10, 8(SP)
+	MOVQ AX, 16(SP)
+	ADDQ AX, DI              // Finish the "d +=" part of "d += emitLiteral(etc)".
+	MOVQ SI, 72(SP)
+	MOVQ DI, 80(SP)
+	MOVQ R15, 112(SP)
+	CALL runtime·memmove(SB)
+	MOVQ 56(SP), CX
+	MOVQ 64(SP), DX
+	MOVQ 72(SP), SI
+	MOVQ 80(SP), DI
+	MOVQ 88(SP), R9
+	MOVQ 112(SP), R15
+	JMP  inner1
+
+inlineEmitLiteralEnd:
+	// End inline of the emitLiteral call.
+	// ----------------------------------------
+
+emitLiteralFastPath:
+	// !!! Emit the 1-byte encoding "uint8(len(lit)-1)<<2".
+	MOVB AX, BX
+	SUBB $1, BX
+	SHLB $2, BX
+	MOVB BX, (DI)
+	ADDQ $1, DI
+
+	// !!! Implement the copy from lit to dst as a 16-byte load and store.
+	// (Encode's documentation says that dst and src must not overlap.)
+	//
+	// This always copies 16 bytes, instead of only len(lit) bytes, but that's
+	// OK. Subsequent iterations will fix up the overrun.
+	//
+	// Note that on amd64, it is legal and cheap to issue unaligned 8-byte or
+	// 16-byte loads and stores. This technique probably wouldn't be as
+	// effective on architectures that are fussier about alignment.
+	MOVOU 0(R10), X0
+	MOVOU X0, 0(DI)
+	ADDQ  AX, DI
+
+inner1:
+	// for { etc }
+
+	// base := s
+	MOVQ SI, R12
+
+	// !!! offset := base - candidate
+	MOVQ R12, R11
+	SUBQ R15, R11
+	SUBQ DX, R11
+
+	// ----------------------------------------
+	// Begin inline of the extendMatch call.
+	//
+	// s = extendMatch(src, candidate+4, s+4)
+
+	// !!! R14 = &src[len(src)]
+	MOVQ src_len+32(FP), R14
+	ADDQ DX, R14
+
+	// !!! R13 = &src[len(src) - 8]
+	MOVQ R14, R13
+	SUBQ $8, R13
+
+	// !!! R15 = &src[candidate + 4]
+	ADDQ $4, R15
+	ADDQ DX, R15
+
+	// !!! s += 4
+	ADDQ $4, SI
+
+inlineExtendMatchCmp8:
+	// As long as we are 8 or more bytes before the end of src, we can load and
+	// compare 8 bytes at a time. If those 8 bytes are equal, repeat.
+	CMPQ SI, R13
+	JA   inlineExtendMatchCmp1
+	MOVQ (R15), AX
+	MOVQ (SI), BX
+	CMPQ AX, BX
+	JNE  inlineExtendMatchBSF
+	ADDQ $8, R15
+	ADDQ $8, SI
+	JMP  inlineExtendMatchCmp8
+
+inlineExtendMatchBSF:
+	// If those 8 bytes were not equal, XOR the two 8 byte values, and return
+	// the index of the first byte that differs. The BSF instruction finds the
+	// least significant 1 bit, the amd64 architecture is little-endian, and
+	// the shift by 3 converts a bit index to a byte index.
+	XORQ AX, BX
+	BSFQ BX, BX
+	SHRQ $3, BX
+	ADDQ BX, SI
+	JMP  inlineExtendMatchEnd
+
+inlineExtendMatchCmp1:
+	// In src's tail, compare 1 byte at a time.
+	CMPQ SI, R14
+	JAE  inlineExtendMatchEnd
+	MOVB (R15), AX
+	MOVB (SI), BX
+	CMPB AX, BX
+	JNE  inlineExtendMatchEnd
+	ADDQ $1, R15
+	ADDQ $1, SI
+	JMP  inlineExtendMatchCmp1
+
+inlineExtendMatchEnd:
+	// End inline of the extendMatch call.
+	// ----------------------------------------
+
+	// ----------------------------------------
+	// Begin inline of the emitCopy call.
+	//
+	// d += emitCopy(dst[d:], base-candidate, s-base)
+
+	// !!! length := s - base
+	MOVQ SI, AX
+	SUBQ R12, AX
+
+inlineEmitCopyLoop0:
+	// for length >= 68 { etc }
+	CMPL AX, $68
+	JLT  inlineEmitCopyStep1
+
+	// Emit a length 64 copy, encoded as 3 bytes.
+	MOVB $0xfe, 0(DI)
+	MOVW R11, 1(DI)
+	ADDQ $3, DI
+	SUBL $64, AX
+	JMP  inlineEmitCopyLoop0
+
+inlineEmitCopyStep1:
+	// if length > 64 { etc }
+	CMPL AX, $64
+	JLE  inlineEmitCopyStep2
+
+	// Emit a length 60 copy, encoded as 3 bytes.
+	MOVB $0xee, 0(DI)
+	MOVW R11, 1(DI)
+	ADDQ $3, DI
+	SUBL $60, AX
+
+inlineEmitCopyStep2:
+	// if length >= 12 || offset >= 2048 { goto inlineEmitCopyStep3 }
+	CMPL AX, $12
+	JGE  inlineEmitCopyStep3
+	CMPL R11, $2048
+	JGE  inlineEmitCopyStep3
+
+	// Emit the remaining copy, encoded as 2 bytes.
+	MOVB R11, 1(DI)
+	SHRL $8, R11
+	SHLB $5, R11
+	SUBB $4, AX
+	SHLB $2, AX
+	ORB  AX, R11
+	ORB  $1, R11
+	MOVB R11, 0(DI)
+	ADDQ $2, DI
+	JMP  inlineEmitCopyEnd
+
+inlineEmitCopyStep3:
+	// Emit the remaining copy, encoded as 3 bytes.
+	SUBL $1, AX
+	SHLB $2, AX
+	ORB  $2, AX
+	MOVB AX, 0(DI)
+	MOVW R11, 1(DI)
+	ADDQ $3, DI
+
+inlineEmitCopyEnd:
+	// End inline of the emitCopy call.
+	// ----------------------------------------
+
+	// nextEmit = s
+	MOVQ SI, R10
+
+	// if s >= sLimit { goto emitRemainder }
+	MOVQ SI, AX
+	SUBQ DX, AX
+	CMPQ AX, R9
+	JAE  emitRemainder
+
+	// As per the encode_other.go code:
+	//
+	// We could immediately etc.
+
+	// x := load64(src, s-1)
+	MOVQ -1(SI), R14
+
+	// prevHash := hash(uint32(x>>0), shift)
+	MOVL  R14, R11
+	IMULL $0x1e35a7bd, R11
+	SHRL  CX, R11
+
+	// table[prevHash] = uint16(s-1)
+	MOVQ SI, AX
+	SUBQ DX, AX
+	SUBQ $1, AX
+
+	// XXX: MOVW AX, table-32768(SP)(R11*2)
+	// XXX: 66 42 89 44 5c 78       mov    %ax,0x78(%rsp,%r11,2)
+	BYTE $0x66
+	BYTE $0x42
+	BYTE $0x89
+	BYTE $0x44
+	BYTE $0x5c
+	BYTE $0x78
+
+	// currHash := hash(uint32(x>>8), shift)
+	SHRQ  $8, R14
+	MOVL  R14, R11
+	IMULL $0x1e35a7bd, R11
+	SHRL  CX, R11
+
+	// candidate = int(table[currHash])
+	// XXX: MOVWQZX table-32768(SP)(R11*2), R15
+	// XXX: 4e 0f b7 7c 5c 78       movzwq 0x78(%rsp,%r11,2),%r15
+	BYTE $0x4e
+	BYTE $0x0f
+	BYTE $0xb7
+	BYTE $0x7c
+	BYTE $0x5c
+	BYTE $0x78
+
+	// table[currHash] = uint16(s)
+	ADDQ $1, AX
+
+	// XXX: MOVW AX, table-32768(SP)(R11*2)
+	// XXX: 66 42 89 44 5c 78       mov    %ax,0x78(%rsp,%r11,2)
+	BYTE $0x66
+	BYTE $0x42
+	BYTE $0x89
+	BYTE $0x44
+	BYTE $0x5c
+	BYTE $0x78
+
+	// if uint32(x>>8) == load32(src, candidate) { continue }
+	MOVL (DX)(R15*1), BX
+	CMPL R14, BX
+	JEQ  inner1
+
+	// nextHash = hash(uint32(x>>16), shift)
+	SHRQ  $8, R14
+	MOVL  R14, R11
+	IMULL $0x1e35a7bd, R11
+	SHRL  CX, R11
+
+	// s++
+	ADDQ $1, SI
+
+	// break out of the inner1 for loop, i.e. continue the outer loop.
+	JMP outer
+
+emitRemainder:
+	// if nextEmit < len(src) { etc }
+	MOVQ src_len+32(FP), AX
+	ADDQ DX, AX
+	CMPQ R10, AX
+	JEQ  encodeBlockEnd
+
+	// d += emitLiteral(dst[d:], src[nextEmit:])
+	//
+	// Push args.
+	MOVQ DI, 0(SP)
+	MOVQ $0, 8(SP)   // Unnecessary, as the callee ignores it, but conservative.
+	MOVQ $0, 16(SP)  // Unnecessary, as the callee ignores it, but conservative.
+	MOVQ R10, 24(SP)
+	SUBQ R10, AX
+	MOVQ AX, 32(SP)
+	MOVQ AX, 40(SP)  // Unnecessary, as the callee ignores it, but conservative.
+
+	// Spill local variables (registers) onto the stack; call; unspill.
+	MOVQ DI, 80(SP)
+	CALL ·emitLiteral(SB)
+	MOVQ 80(SP), DI
+
+	// Finish the "d +=" part of "d += emitLiteral(etc)".
+	ADDQ 48(SP), DI
+
+encodeBlockEnd:
+	MOVQ dst_base+0(FP), AX
+	SUBQ AX, DI
+	MOVQ DI, d+48(FP)
+	RET
--- a/vendor/github.com/klauspost/compress/snappy/encode_other.go
+++ b/vendor/github.com/klauspost/compress/snappy/encode_other.go
@@ -0,0 +1,238 @@
+// Copyright 2016 The Snappy-Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+// +build !amd64 appengine !gc noasm
+
+package snappy
+
+func load32(b []byte, i int) uint32 {
+	b = b[i : i+4 : len(b)] // Help the compiler eliminate bounds checks on the next line.
+	return uint32(b[0]) | uint32(b[1])<<8 | uint32(b[2])<<16 | uint32(b[3])<<24
+}
+
+func load64(b []byte, i int) uint64 {
+	b = b[i : i+8 : len(b)] // Help the compiler eliminate bounds checks on the next line.
+	return uint64(b[0]) | uint64(b[1])<<8 | uint64(b[2])<<16 | uint64(b[3])<<24 |
+		uint64(b[4])<<32 | uint64(b[5])<<40 | uint64(b[6])<<48 | uint64(b[7])<<56
+}
+
+// emitLiteral writes a literal chunk and returns the number of bytes written.
+//
+// It assumes that:
+//	dst is long enough to hold the encoded bytes
+//	1 <= len(lit) && len(lit) <= 65536
+func emitLiteral(dst, lit []byte) int {
+	i, n := 0, uint(len(lit)-1)
+	switch {
+	case n < 60:
+		dst[0] = uint8(n)<<2 | tagLiteral
+		i = 1
+	case n < 1<<8:
+		dst[0] = 60<<2 | tagLiteral
+		dst[1] = uint8(n)
+		i = 2
+	default:
+		dst[0] = 61<<2 | tagLiteral
+		dst[1] = uint8(n)
+		dst[2] = uint8(n >> 8)
+		i = 3
+	}
+	return i + copy(dst[i:], lit)
+}
+
+// emitCopy writes a copy chunk and returns the number of bytes written.
+//
+// It assumes that:
+//	dst is long enough to hold the encoded bytes
+//	1 <= offset && offset <= 65535
+//	4 <= length && length <= 65535
+func emitCopy(dst []byte, offset, length int) int {
+	i := 0
+	// The maximum length for a single tagCopy1 or tagCopy2 op is 64 bytes. The
+	// threshold for this loop is a little higher (at 68 = 64 + 4), and the
+	// length emitted down below is is a little lower (at 60 = 64 - 4), because
+	// it's shorter to encode a length 67 copy as a length 60 tagCopy2 followed
+	// by a length 7 tagCopy1 (which encodes as 3+2 bytes) than to encode it as
+	// a length 64 tagCopy2 followed by a length 3 tagCopy2 (which encodes as
+	// 3+3 bytes). The magic 4 in the 64±4 is because the minimum length for a
+	// tagCopy1 op is 4 bytes, which is why a length 3 copy has to be an
+	// encodes-as-3-bytes tagCopy2 instead of an encodes-as-2-bytes tagCopy1.
+	for length >= 68 {
+		// Emit a length 64 copy, encoded as 3 bytes.
+		dst[i+0] = 63<<2 | tagCopy2
+		dst[i+1] = uint8(offset)
+		dst[i+2] = uint8(offset >> 8)
+		i += 3
+		length -= 64
+	}
+	if length > 64 {
+		// Emit a length 60 copy, encoded as 3 bytes.
+		dst[i+0] = 59<<2 | tagCopy2
+		dst[i+1] = uint8(offset)
+		dst[i+2] = uint8(offset >> 8)
+		i += 3
+		length -= 60
+	}
+	if length >= 12 || offset >= 2048 {
+		// Emit the remaining copy, encoded as 3 bytes.
+		dst[i+0] = uint8(length-1)<<2 | tagCopy2
+		dst[i+1] = uint8(offset)
+		dst[i+2] = uint8(offset >> 8)
+		return i + 3
+	}
+	// Emit the remaining copy, encoded as 2 bytes.
+	dst[i+0] = uint8(offset>>8)<<5 | uint8(length-4)<<2 | tagCopy1
+	dst[i+1] = uint8(offset)
+	return i + 2
+}
+
+// extendMatch returns the largest k such that k <= len(src) and that
+// src[i:i+k-j] and src[j:k] have the same contents.
+//
+// It assumes that:
+//	0 <= i && i < j && j <= len(src)
+func extendMatch(src []byte, i, j int) int {
+	for ; j < len(src) && src[i] == src[j]; i, j = i+1, j+1 {
+	}
+	return j
+}
+
+func hash(u, shift uint32) uint32 {
+	return (u * 0x1e35a7bd) >> shift
+}
+
+// encodeBlock encodes a non-empty src to a guaranteed-large-enough dst. It
+// assumes that the varint-encoded length of the decompressed bytes has already
+// been written.
+//
+// It also assumes that:
+//	len(dst) >= MaxEncodedLen(len(src)) &&
+// 	minNonLiteralBlockSize <= len(src) && len(src) <= maxBlockSize
+func encodeBlock(dst, src []byte) (d int) {
+	// Initialize the hash table. Its size ranges from 1<<8 to 1<<14 inclusive.
+	// The table element type is uint16, as s < sLimit and sLimit < len(src)
+	// and len(src) <= maxBlockSize and maxBlockSize == 65536.
+	const (
+		maxTableSize = 1 << 14
+		// tableMask is redundant, but helps the compiler eliminate bounds
+		// checks.
+		tableMask = maxTableSize - 1
+	)
+	shift := uint32(32 - 8)
+	for tableSize := 1 << 8; tableSize < maxTableSize && tableSize < len(src); tableSize *= 2 {
+		shift--
+	}
+	// In Go, all array elements are zero-initialized, so there is no advantage
+	// to a smaller tableSize per se. However, it matches the C++ algorithm,
+	// and in the asm versions of this code, we can get away with zeroing only
+	// the first tableSize elements.
+	var table [maxTableSize]uint16
+
+	// sLimit is when to stop looking for offset/length copies. The inputMargin
+	// lets us use a fast path for emitLiteral in the main loop, while we are
+	// looking for copies.
+	sLimit := len(src) - inputMargin
+
+	// nextEmit is where in src the next emitLiteral should start from.
+	nextEmit := 0
+
+	// The encoded form must start with a literal, as there are no previous
+	// bytes to copy, so we start looking for hash matches at s == 1.
+	s := 1
+	nextHash := hash(load32(src, s), shift)
+
+	for {
+		// Copied from the C++ snappy implementation:
+		//
+		// Heuristic match skipping: If 32 bytes are scanned with no matches
+		// found, start looking only at every other byte. If 32 more bytes are
+		// scanned (or skipped), look at every third byte, etc.. When a match
+		// is found, immediately go back to looking at every byte. This is a
+		// small loss (~5% performance, ~0.1% density) for compressible data
+		// due to more bookkeeping, but for non-compressible data (such as
+		// JPEG) it's a huge win since the compressor quickly "realizes" the
+		// data is incompressible and doesn't bother looking for matches
+		// everywhere.
+		//
+		// The "skip" variable keeps track of how many bytes there are since
+		// the last match; dividing it by 32 (ie. right-shifting by five) gives
+		// the number of bytes to move ahead for each iteration.
+		skip := 32
+
+		nextS := s
+		candidate := 0
+		for {
+			s = nextS
+			bytesBetweenHashLookups := skip >> 5
+			nextS = s + bytesBetweenHashLookups
+			skip += bytesBetweenHashLookups
+			if nextS > sLimit {
+				goto emitRemainder
+			}
+			candidate = int(table[nextHash&tableMask])
+			table[nextHash&tableMask] = uint16(s)
+			nextHash = hash(load32(src, nextS), shift)
+			if load32(src, s) == load32(src, candidate) {
+				break
+			}
+		}
+
+		// A 4-byte match has been found. We'll later see if more than 4 bytes
+		// match. But, prior to the match, src[nextEmit:s] are unmatched. Emit
+		// them as literal bytes.
+		d += emitLiteral(dst[d:], src[nextEmit:s])
+
+		// Call emitCopy, and then see if another emitCopy could be our next
+		// move. Repeat until we find no match for the input immediately after
+		// what was consumed by the last emitCopy call.
+		//
+		// If we exit this loop normally then we need to call emitLiteral next,
+		// though we don't yet know how big the literal will be. We handle that
+		// by proceeding to the next iteration of the main loop. We also can
+		// exit this loop via goto if we get close to exhausting the input.
+		for {
+			// Invariant: we have a 4-byte match at s, and no need to emit any
+			// literal bytes prior to s.
+			base := s
+
+			// Extend the 4-byte match as long as possible.
+			//
+			// This is an inlined version of:
+			//	s = extendMatch(src, candidate+4, s+4)
+			s += 4
+			for i := candidate + 4; s < len(src) && src[i] == src[s]; i, s = i+1, s+1 {
+			}
+
+			d += emitCopy(dst[d:], base-candidate, s-base)
+			nextEmit = s
+			if s >= sLimit {
+				goto emitRemainder
+			}
+
+			// We could immediately start working at s now, but to improve
+			// compression we first update the hash table at s-1 and at s. If
+			// another emitCopy is not our next move, also calculate nextHash
+			// at s+1. At least on GOARCH=amd64, these three hash calculations
+			// are faster as one load64 call (with some shifts) instead of
+			// three load32 calls.
+			x := load64(src, s-1)
+			prevHash := hash(uint32(x>>0), shift)
+			table[prevHash&tableMask] = uint16(s - 1)
+			currHash := hash(uint32(x>>8), shift)
+			candidate = int(table[currHash&tableMask])
+			table[currHash&tableMask] = uint16(s)
+			if uint32(x>>8) != load32(src, candidate) {
+				nextHash = hash(uint32(x>>16), shift)
+				s++
+				break
+			}
+		}
+	}
+
+emitRemainder:
+	if nextEmit < len(src) {
+		d += emitLiteral(dst[d:], src[nextEmit:])
+	}
+	return d
+}
--- a/vendor/github.com/klauspost/compress/snappy/golden_test.go
+++ b/vendor/github.com/klauspost/compress/snappy/golden_test.go
--- a/Show More
+++ b/Show More
				`@@ -0,0 +1 @@`
				3.141592653589793238462643383279502884197169399375105820974944592307816406286208998628034825342117067982148086513282306647093844609550582231725359408128481117450284102701938521105559644622948954930381964428810975665933446128475648233786783165271201909145648566923460348610454326648213393607260249141273724587006606315588174881520920962829254091715364367892590360011330530548820466521384146951941511609433057270365759591953092186117381932611793105118548074462379962749567351885752724891227938183011949129833673362440656643086021394946395224737190702179860943702770539217176293176752384674818467669405132000568127145263560827785771342757789609173637178721468440901224953430146549585371050792279689258923542019956112129021960864034418159813629774771309960518707211349999998372978049951059731732816096318595024459455346908302642522308253344685035261931188171010003137838752886587533208381420617177669147303598253490428755468731159562863882353787593751957781857780532171226806613001927876611195909216420198938095257201065485863278865936153381827968230301952035301852968995773622599413891249721775283479131515574857242454150695950829533116861727855889075098381754637464939319255060400927701671139009848824012858361603563707660104710181942955596198946767837449448255379774726847104047534646208046684259069491293313677028989152104752162056966024058038150193511253382430035587640247496473263914199272604269922796782354781636009341721641219924586315030286182974555706749838505494588586926995690927210797509302955321165344987202755960236480665499119881834797753566369807426542527862551818417574672890977772793800081647060016145249192173217214772350141441973568548161361157352552133475741849468438523323907394143334547762416862518983569485562099219222184272550254256887671790494601653466804988627232791786085784383827967976681454100953883786360950680064225125205117392984896084128488626945604241965285022210661186306744278622039194945047123713786960956364371917287467764657573962413890865832645995813390478027590099465764078951269468398352595709825822620522489407726719478268482601476990902640136394437455305068203496252451749399651431429809190659250937221696461515709858387410597885959772975498930161753928468138268683868942774155991855925245953959431049972524680845987273644695848653836736222626099124608051243884390451244136549762780797715691435997700129616089441694868555848406353422072225828488648158456028506016842739452267467678895252138522549954666727823986456596116354886230577456498035593634568174324112515076069479451096596094025228879710893145669136867228748940560101503308617928680920874760917824938589009714909675985261365549781893129784821682998948722658804857564014270477555132379641451523746234364542858444795265867821051141354735739523113427166102135969536231442952484937187110145765403590279934403742007310578539062198387447808478489683321445713868751943506430218453191048481005370614680674919278191197939952061419663428754440643745123718192179998391015919561814675142691239748940907186494231961567945208095146550225231603881930142093762137855956638937787083039069792077346722182562599661501421503068038447734549202605414665925201497442850732518666002132434088190710486331734649651453905796268561005508106658796998163574736384052571459102897064140110971206280439039759515677157700420337869936007230558763176359421873125147120532928191826186125867321579198414848829164470609575270695722091756711672291098169091528017350671274858322287183520935396572512108357915136988209144421006751033467110314126711136990865851639831501970165151168517143765761835155650884909989859982387345528331635507647918535893226185489632132933089857064204675259070915481416549859461637180
				`@@ -0,0 +1 @@`
				<1C>_K<5F>0<1C><><EFBFBD><05><0E><>`K<><4B>0Aasě)^<5E>H<EFBFBD><48><EFBFBD><EFBFBD><EFBFBD>Iɟb߻<><DFBB>_><3E>4
				`@@ -0,0 +1 @@`
				`00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000`