Почему я получаю отрицательное сжатие для Gzip, Snappy, Smile?

Я пытался выяснить, какое сжатие подходит для моего приложения для сжатия строки JSON. Целью здесь является сжатие объекта JSON перед сохранением в REDIS.

Вот мои результаты

Пробная версия сжатия Gzip

compression percent : -8.7719345 %
to json time : 151 microseconds
pure saveable compression : 3326 microseconds
gzip compression+convert to json time : 3477 microseconds
gzip de-compression to string time : 537 microseconds

Пробная версия Snappy Compression

compression percent : -22.807014 %
to json time : 58 microseconds
pure saveable compression : 259490 microseconds
snappy compression+convert to json time : 259549 microseconds
snappy de-compression to string time : 84 microseconds

Smile (msgpack) Пробная версия сжатия

compression percent : -24.561401 %
smile compression time : 3314 microseconds
smile de-compression time : n/a 

Однако что довольно странно, так это то, что Snappy должен работать намного быстрее (из того, что я читал), только распаковка выполняется быстро, а сжатие занимает больше времени. Также странно, что улыбка создает более длинную сохраняемую строку. Может ли кто-нибудь указать, почему или что я здесь делаю неправильно?

Вот мой код для этой пробной версии

import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.dataformat.smile.SmileFactory;
import com.fasterxml.jackson.dataformat.smile.SmileGenerator;
import com.fasterxml.jackson.dataformat.smile.SmileParser;
import org.xerial.snappy.Snappy;

import javax.xml.bind.DatatypeConverter;
import java.io.*;
import java.util.concurrent.TimeUnit;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class CompressionTrials {

    public static void main(String[] args) {
        jsonCompressionTrial();
    }

        public static void jsonCompressionTrial(){
            SimpleDto originalDto = new SimpleDto();
            originalDto.setFname("MyFirstName");
            originalDto.setLname("MyLastName");
            originalDto.setDescription("This is a long description. I am trying out compression options for JSON. Hopefully the results will help me decide on one approach");
            originalDto.setCity("MyCity");
            originalDto.setAge(36);
            originalDto.setZip(2424);

            gzipCompressionTrial(originalDto);
            snappyCompressionTrial(originalDto);
            smileCompressionTrial(originalDto);

        }

        public static void gzipCompressionTrial(SimpleDto simpleDto){
            if(simpleDto == null){
                return;
            }

            ObjectMapper mapper = new ObjectMapper();
            String originalJsonString = null;
            long compressionAndConversionMicroSeconds = 0;
            long toJsonMicroSeconds = 0;
            long compressionMicroSeconds = 0;
            long decompressionMicroSeconds = 0;
            SimpleDto restoredDto = null;
            String restoredDtoJson = null;
            try {
                mapper.writeValueAsString(simpleDto);
                long endConversionTime = 0;
                long startTimeCompressionAndConvesion = System.nanoTime();
                originalJsonString = mapper.writeValueAsString(simpleDto);
                endConversionTime = System.nanoTime();
                byte[] compressedBytes = gzipCompress(originalJsonString);
                String compressedStringToSave = bytesToStringBase64(compressedBytes);
                long endTimeCompression = System.nanoTime();
                long startCompressionTime = endConversionTime;
                toJsonMicroSeconds =  TimeUnit.NANOSECONDS.toMicros((endConversionTime-startTimeCompressionAndConvesion));
                compressionMicroSeconds =  TimeUnit.NANOSECONDS.toMicros((endTimeCompression-startCompressionTime));
                compressionAndConversionMicroSeconds =  TimeUnit.NANOSECONDS.toMicros((endTimeCompression-startTimeCompressionAndConvesion));

                long startTimeDecompression = System.nanoTime();
                String unCompressedString = gzipDecompress(compressedBytes);
                long endTimeDecompression = System.nanoTime();
                decompressionMicroSeconds = TimeUnit.NANOSECONDS.toMicros(endTimeDecompression-startTimeDecompression); // TimeUnit.MILLISECONDS.convert((endTimeDecompression - startTimeDecompression), TimeUnit.NANOSECONDS);

                int originalLength = originalJsonString.toString().length();
                int compressedLength = compressedStringToSave.toString().length();
                float compressionPercent = 100 - (( (float)compressedLength / (float)originalLength ) * 100);

                restoredDto = mapper.readValue(originalJsonString, SimpleDto.class);
                restoredDtoJson = mapper.writeValueAsString(restoredDto);

                System.out.println("============================================================================================== ");
                System.out.println("  Gzip Compression Trial");
                System.out.println("----------------------------------------------------------------------------------------------");
    //            System.out.println("origin dto as json : " + originalJsonString );
    //            System.out.println( "original dto-json string length : " + originalLength);
    //            System.out.println( "compressed string length : " + compressedLength );
    //            System.out.println( "uncompressed json string : " + unCompressedString );
    //            System.out.println( " restored dto as json : " + restoredDtoJson );
    //            System.out.println( " is before-compressed = uncompressed : " + unCompressedString.equals(originalJsonString) );
    //            System.out.println( " is restored object json = original object json : " + originalJsonString.equals(restoredDtoJson) );
    //            System.out.println("----------------------------------------------------------------------------------------------");
                System.out.println("compression percent : " + compressionPercent + " %" );
                System.out.println("to json time : " + toJsonMicroSeconds + " microseconds" );
                System.out.println(" pure saveable compression : " + compressionMicroSeconds + " microseconds" );
                System.out.println("gzip compression+convert to json time : " + compressionAndConversionMicroSeconds + " microseconds" );
                System.out.println("gzip de-compression to string time : " + decompressionMicroSeconds + " microseconds" );
                System.out.println("============================================================================================== ");

            } catch (IOException e) {
                e.printStackTrace();
            } catch (Exception e) {
                e.printStackTrace();
            }


        }

        public static void smileCompressionTrial(SimpleDto simpleDto){
            if(simpleDto == null){
                return;
            }

            ObjectMapper mapper = new ObjectMapper();
            ObjectMapper smileMapper = getSmileObjectMapper();
            String originalJsonString = null;
            try {
                originalJsonString = mapper.writeValueAsString(simpleDto);
            } catch (JsonProcessingException e) {
                e.printStackTrace();
                return;
            }
            long compressionMicroSeconds = 0;
            long decompressionMicroSeconds = 0;
            SimpleDto restoredDto = null;
            String restoredDtoJson = null;

            try {
                mapper.writeValueAsString(simpleDto);
                long startTimeCompression = System.nanoTime();
                byte[] compressedBytes = smileMapper.writeValueAsBytes(simpleDto);
                //String compressedStringToSave = new String(compressedBytes, "UTF-8");// bytesToStringBase64(compressedBytes);
                String compressedStringToSave = bytesToStringBase64(compressedBytes);
    //            System.out.println("smile compressed : " + compressedStringToSave);
    //            System.out.println("original length : " + originalJsonString.length() );
    //            System.out.println("length : " + compressedStringToSave.length() );
                long endTimeCompression = System.nanoTime();
                compressionMicroSeconds =  TimeUnit.NANOSECONDS.toMicros((endTimeCompression-startTimeCompression)); //TimeUnit.MILLISECONDS.convert((endTimeCompression - startTimeCompression), TimeUnit.NANOSECONDS);

    //            long startTimeDecompression = System.nanoTime();
    //            String unCompressedString = gzipDecompress(compressedBytes);
    //            long endTimeDecompression = System.nanoTime();
    //            decompressionMicroSeconds = TimeUnit.NANOSECONDS.toMicros(endTimeDecompression-startTimeDecompression); // TimeUnit.MILLISECONDS.convert((endTimeDecompression - startTimeDecompression), TimeUnit.NANOSECONDS);


                int originalLength = originalJsonString.toString().length();
                int compressedLength = compressedStringToSave.toString().length();
                float compressionPercent = 100 - (( (float)compressedLength / (float)originalLength ) * 100);

                restoredDto = smileMapper.readValue( stringToBytesBase64(compressedStringToSave) , SimpleDto.class);
                //restoredDto = smileMapper.readValue( compressedStringToSave.getBytes("UTF-8") , SimpleDto.class);
                restoredDtoJson = mapper.writeValueAsString(restoredDto);

                System.out.println("============================================================================================== ");
                System.out.println("  Smile Compression Trial");
                System.out.println("----------------------------------------------------------------------------------------------");
    //            System.out.println("origin dto as json : " + originalJsonString );
    //            System.out.println( "original dto-json string length : " + originalLength);
    //            System.out.println( "compressed string length : " + compressedLength );
    //            System.out.println( "uncompressed json string : n/a" /*+ unCompressedString*/ );
    //            System.out.println( " restored dto as json : " + restoredDtoJson );
    //            System.out.println( " is before-compressed = uncompressed : n/a " /*+ unCompressedString.equals(originalJsonString)*/ );
    //            System.out.println( " is restored object json = original object json : " + originalJsonString.equals(restoredDtoJson) );
    //            System.out.println("----------------------------------------------------------------------------------------------");
                System.out.println("compression percent : " + compressionPercent + " %" );
                System.out.println("smile compression time : " + compressionMicroSeconds + " microseconds" );
                System.out.println("smile de-compression time : n/a " /*+ decompressionMicroSeconds + " microseconds"*/ );
                System.out.println("============================================================================================== ");

            } catch (IOException e) {
                e.printStackTrace();
            } catch (Exception e) {
                e.printStackTrace();
            }


        }

        public static void snappyCompressionTrial(SimpleDto simpleDto) {
            if (simpleDto == null) {
                return;
            }

            ObjectMapper mapper = new ObjectMapper();
            String originalJsonString = null;
            long compressionAndConversionMicroSeconds = 0;
            long toJsonMicroSeconds = 0;
            long compressionMicroSeconds = 0;
            long decompressionMicroSeconds = 0;
            SimpleDto restoredDto = null;
            String restoredDtoJson = null;
            try {
                mapper.writeValueAsString(simpleDto);
                long endConversionTime = 0;
                long startTimeCompressionAndConvesion = System.nanoTime();
                originalJsonString = mapper.writeValueAsString(simpleDto);
                endConversionTime = System.nanoTime();
                byte[] compressedBytes = snappyCompress(originalJsonString);
                String compressedStringToSave = bytesToStringBase64(compressedBytes);
                long endTimeCompression = System.nanoTime();
                long startCompressionTime = endConversionTime;
                toJsonMicroSeconds = TimeUnit.NANOSECONDS.toMicros((endConversionTime - startTimeCompressionAndConvesion));
                compressionMicroSeconds = TimeUnit.NANOSECONDS.toMicros((endTimeCompression - startCompressionTime));
                compressionAndConversionMicroSeconds = TimeUnit.NANOSECONDS.toMicros((endTimeCompression - startTimeCompressionAndConvesion));

                long startTimeDecompression = System.nanoTime();
                byte[] unCompressedBytes = snappyUncompress(compressedBytes);
                String unCompressedString = bytesToStringUtf8(unCompressedBytes);
                long endTimeDecompression = System.nanoTime();
                decompressionMicroSeconds = TimeUnit.NANOSECONDS.toMicros(endTimeDecompression - startTimeDecompression);

                int originalLength = originalJsonString.toString().length();
                int compressedLength = compressedStringToSave.toString().length();
                float compressionPercent = 100 - (((float) compressedLength / (float) originalLength) * 100);

                //restoredDto = mapper.readValue(originalJsonString, SimpleDto.class);
                restoredDto = mapper.readValue(unCompressedBytes, SimpleDto.class);
                restoredDtoJson = mapper.writeValueAsString(restoredDto);

                System.out.println("============================================================================================== ");
                System.out.println("  Snappy Compression Trial");
                System.out.println("----------------------------------------------------------------------------------------------");
                //            System.out.println("origin dto as json : " + originalJsonString );
                //            System.out.println( "original dto-json string length : " + originalLength);
                //            System.out.println( "compressed string length : " + compressedLength );
                //            System.out.println( "uncompressed json string : " + unCompressedString );
                //            System.out.println( " restored dto as json : " + restoredDtoJson );
                //            System.out.println( " is before-compressed = uncompressed : " + unCompressedString.equals(originalJsonString) );
                //            System.out.println( " is restored object json = original object json : " + originalJsonString.equals(restoredDtoJson) );
                //            System.out.println("----------------------------------------------------------------------------------------------");
                System.out.println("compression percent : " + compressionPercent + " %");
                System.out.println("to json time : " + toJsonMicroSeconds + " microseconds");
                System.out.println(" pure saveable compression : " + compressionMicroSeconds + " microseconds");
                System.out.println("gzip compression+convert to json time : " + compressionAndConversionMicroSeconds + " microseconds");
                System.out.println("gzip de-compression to string time : " + decompressionMicroSeconds + " microseconds");
                System.out.println("============================================================================================== ");

            } catch (IOException e) {
                e.printStackTrace();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }

        private static ObjectMapper getSmileObjectMapper() {
            SmileFactory smileFactory = new SmileFactory();
            smileFactory.configure(SmileGenerator.Feature.CHECK_SHARED_NAMES,true);
            smileFactory.configure(SmileGenerator.Feature.CHECK_SHARED_STRING_VALUES,true);
            smileFactory.configure(SmileGenerator.Feature.ENCODE_BINARY_AS_7BIT,true);
            smileFactory.configure(SmileGenerator.Feature.WRITE_HEADER,true);
            smileFactory.configure(SmileGenerator.Feature.WRITE_END_MARKER,false);
            smileFactory.configure(SmileParser.Feature.REQUIRE_HEADER,false);
            return new ObjectMapper(smileFactory);
        }

        public static byte[] gzipCompress(String str) throws IOException {
            if (str == null || str.length() == 0) {
                return null;
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            GZIPOutputStream gzip = new GZIPOutputStream(out);
            gzip.write(str.getBytes());
            gzip.close();
            return out.toByteArray();
    //        String outStr = out.toString("UTF-8");
    //        return outStr;
        }

        public static String gzipDecompress(byte[] bytes) throws Exception {
            if (bytes == null || bytes.length == 0) {
                return null;
            }
            GZIPInputStream gis = new GZIPInputStream(new ByteArrayInputStream(bytes));
            BufferedReader bf = new BufferedReader(new InputStreamReader(gis, "UTF-8"));
            String outStr = "";
            String line;
            while ((line=bf.readLine())!=null) {
                outStr += line;
            }
            return outStr;
        }

        public static byte[] snappyCompress(String stringData) throws IOException {
            return Snappy.compress(stringData);
        }

        public static byte[] snappyUncompress(byte[] bytes) throws IOException {
            return Snappy.uncompress(bytes);
        }

        private static String bytesToStringBase64(byte[] bytes){
            return DatatypeConverter.printBase64Binary(bytes);
        }

        private static byte[] stringToBytesBase64(String dataString){
            return DatatypeConverter.parseBase64Binary(dataString);
        }

        private static String bytesToStringUtf8(byte[] bytes) throws UnsupportedEncodingException {
            return new String(bytes, "UTF-8");
        }

        private static byte[] stringToBytesUtf8(String dataString) throws UnsupportedEncodingException {
            return dataString.getBytes("UTF-8");
        }


}

Детали среды: Windows 7, процессор i7 2,4 Gz, 16 ГБ ОЗУ, Java 8

Версии используемых библиотек:

<dependency>
    <groupId>com.fasterxml.jackson.dataformat</groupId>
    <artifactId>jackson-dataformat-smile</artifactId>
    <version>2.6.4</version>
</dependency>
<dependency>
    <groupId>com.fasterxml.jackson.core</groupId>
    <artifactId>jackson-databind</artifactId>
    <version>2.6.4</version>
</dependency>

<dependency>
    <groupId>org.projectlombok</groupId>
    <artifactId>lombok</artifactId>
    <scope>provided</scope>
    <version>1.16.6</version>
</dependency>

<dependency>
    <groupId>org.xerial.snappy</groupId>
    <artifactId>snappy-java</artifactId>
    <version>1.1.2</version>
</dependency>

*** Это не тест, а просто личное испытание, чтобы выбрать стратегию сжатия для моего варианта использования.

Пожалуйста, дайте мне знать, если кто-нибудь заметит какую-либо ошибку в моей пробной версии

Обновление:

Ниже приведен более простой код, чтобы попробовать

    public static void stringCompressionTrial(){
        String string = "I am what I am hhhhhhhhhhhhhhhhhhhhhhhhhhhhh"
                + "bjggujhhhhhhhhh"
                + "rggggggggggggggggggggggggg"
                + "esfffffffffffffffffffffffffffffff"
                + "esffffffffffffffffffffffffffffffff"
                + "esfekfgy enter code here`etd`enter code here wdd"
                + "heljwidgutwdbwdq8d"
                + "skdfgysrdsdnjsvfyekbdsgcu"
                +"jbujsbjvugsduddbdj";

        // uncomment below to use the json
//        SimpleDto originalDto = new SimpleDto();
//        originalDto.setFname("MyFirstName");
//        originalDto.setLname("MySecondName");
//        originalDto.setDescription("This is a long description. I am trying out compression options for JSON. Hopefully the results will help me decide on one approach");
//        originalDto.setCity("MyCity");
//        originalDto.setAge(36);
//        originalDto.setZip(1111);
//        ObjectMapper mapper = new ObjectMapper();
//        try {
//            string = mapper.writeValueAsString(originalDto);
//        } catch (JsonProcessingException e) {
//            e.printStackTrace();
//        }

        byte[] compressedBytes = null;
        String compressedString = null;
        try {
            compressedBytes = gzipCompress(string);
            compressedString = bytesToStringBase64(compressedBytes);
            System.out.println("after gzipDecompress:" + compressedString);
            //String decomp = gzipDecompress(compressedBytes);
            String decompressedString = gzipDecompress( stringToBytesBase64(compressedString) );
            System.out.println("decompressed string  : " + decompressedString);
            System.out.println( " original string length : " + string.length());
            System.out.println( " compressedString length : " + compressedString.length() );

        } catch (IOException e) {
            e.printStackTrace();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

person R K Punjal    schedule 16.12.2015    source источник
comment
Можно ли добавить оболочку сжатия для джедаев, как это сделано для салата здесь: /d8c51944f8188197f54f ?   -  person R K Punjal    schedule 16.12.2015
comment
Как ни странно, gzip сжимает строки, когда это не JSON.   -  person R K Punjal    schedule 16.12.2015


Ответы (1)


Потому что вы пытаетесь сжать короткие строки. Сжатию требуется больше данных, чтобы найти избыточность и воспользоваться преимуществами перекоса частот символов.

person Mark Adler    schedule 17.12.2015