
A fast way to write data from a std::vector to a text file

lovepro 2021. 1. 5. 19:50

I currently write a set of doubles from a vector to a text file like this:

std::ofstream fout;
fout.open("vector.txt");

for (l = 0; l < vector.size(); l++)
    fout << std::setprecision(10) << vector.at(l) << std::endl;

fout.close();

However, this takes a long time to finish. Is there a faster or more efficient way to do this? I would love to see and learn it.


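The algorithm has two parts:

  1. Serialize the doubles to a string or character buffer.

  2. Write the result to a file.

The first part can be improved (> 20%) by using sprintf or fmt. The second part can be sped up by caching the result in a buffer before writing it to the output file, or by enlarging the output file stream's buffer. Avoid std::endl because it is much slower than "\n". If you still want it faster, write the data in binary format. Below is a complete code sample that includes my proposed solutions and Edgar Rokyan's solution. I have also included Ben Voigt's and Matthieu M.'s suggestions in the test code.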

#include <algorithm>
#include <cstdio>
#include <cstdlib>
#include <fstream>
#include <iomanip>
#include <iostream>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// https://github.com/fmtlib/fmt
#include "fmt/format.h"

// http://uscilab.github.io/cereal/
#include "cereal/archives/binary.hpp"
#include "cereal/archives/json.hpp"
#include "cereal/archives/portable_binary.hpp"
#include "cereal/archives/xml.hpp"
#include "cereal/types/string.hpp"
#include "cereal/types/vector.hpp"

// https://github.com/DigitalInBlue/Celero
#include "celero/Celero.h"

template <typename T> const char* getFormattedString();
template<> const char* getFormattedString<double>(){return "%g\n";}
template<> const char* getFormattedString<float>(){return "%g\n";}
template<> const char* getFormattedString<int>(){return "%d\n";}
// %zu is the portable conversion for size_t (%lu is only right where size_t is unsigned long).
template<> const char* getFormattedString<size_t>(){return "%zu\n";}


namespace {
    constexpr size_t LEN = 32;

    template <typename T> std::vector<T> create_test_data(const size_t N) {
        std::vector<T> data(N);
        for (size_t idx = 0; idx < N; ++idx) {
            data[idx] = idx;
        }
        return data;
    }

    template <typename Iterator> auto toVectorOfChar(Iterator begin, Iterator end) {
        char aLine[LEN];
        std::vector<char> buffer;
        buffer.reserve(std::distance(begin, end) * LEN);
        const char* fmtStr = getFormattedString<typename std::iterator_traits<Iterator>::value_type>();
        std::for_each(begin, end, [&buffer, &aLine, &fmtStr](const auto value) {
            sprintf(aLine, fmtStr, value);
            for (size_t idx = 0; aLine[idx] != 0; ++idx) {
                buffer.push_back(aLine[idx]);
            }
        });
        return buffer;
    }

    template <typename Iterator>
    auto toStringStream(Iterator begin, Iterator end, std::stringstream &buffer) {
        char aLine[LEN];
        const char* fmtStr = getFormattedString<typename std::iterator_traits<Iterator>::value_type>();
        std::for_each(begin, end, [&buffer, &aLine, &fmtStr](const auto value) {            
            sprintf(aLine, fmtStr, value);
            buffer << aLine;
        });
    }

    template <typename Iterator> auto toMemoryWriter(Iterator begin, Iterator end) {
        fmt::MemoryWriter writer;
        std::for_each(begin, end, [&writer](const auto value) { writer << value << "\n"; });
        return writer;
    }

    // A modified version of the original approach.
    template <typename Container>
    void original_approach(const Container &data, const std::string &fileName) {
        std::ofstream fout(fileName);
        for (size_t l = 0; l < data.size(); l++) {
            fout << data[l] << std::endl;
        }
        fout.close();
    }

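    // Replace std::endl by "\n" and give the stream a larger buffer.
    // Note: the effect of pubsetbuf() on an already-open file is
    // implementation-defined; some standard libraries ignore it here.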
    template <typename Iterator>
    void improved_original_approach(Iterator begin, Iterator end, const std::string &fileName) {
        std::ofstream fout(fileName);
        const size_t len = std::distance(begin, end) * LEN;
        std::vector<char> buffer(len);
        fout.rdbuf()->pubsetbuf(buffer.data(), len);
        for (Iterator it = begin; it != end; ++it) {
            fout << *it << "\n";
        }
        fout.close();
    }

    // Edgar Rokyan's solution: std::copy with std::ostream_iterator
    template <typename Iterator>
    void edgar_rokyan_solution(Iterator begin, Iterator end, const std::string &fileName) {
        std::ofstream fout(fileName);
        std::copy(begin, end, std::ostream_iterator<double>(fout, "\n"));
    }

    // Cache to a string stream before writing to the output file
    template <typename Iterator>
    void stringstream_approach(Iterator begin, Iterator end, const std::string &fileName) {
        std::stringstream buffer;
        for (Iterator it = begin; it != end; ++it) {
            buffer << *it << "\n";
        }

        // Now write to the output file.
        std::ofstream fout(fileName);
        fout << buffer.str();
        fout.close();
    }

    // Use sprintf
    template <typename Iterator>
    void sprintf_approach(Iterator begin, Iterator end, const std::string &fileName) {
        std::stringstream buffer;
        toStringStream(begin, end, buffer);
        std::ofstream fout(fileName);
        fout << buffer.str();
        fout.close();
    }

    // Use fmt::MemoryWriter (https://github.com/fmtlib/fmt)
    template <typename Iterator>
    void fmt_approach(Iterator begin, Iterator end, const std::string &fileName) {
        auto writer = toMemoryWriter(begin, end);
        std::ofstream fout(fileName);
        fout << writer.str();
        fout.close();
    }

    // Use std::vector<char>
    template <typename Iterator>
    void vector_of_char_approach(Iterator begin, Iterator end, const std::string &fileName) {
        std::vector<char> buffer = toVectorOfChar(begin, end);
        std::ofstream fout(fileName);
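        // The buffer is not null-terminated, so write it by explicit length.
        fout.write(buffer.data(), static_cast<std::streamsize>(buffer.size()));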
        fout.close();
    }

    // Use cereal (http://uscilab.github.io/cereal/).
    template <typename Container, typename OArchive = cereal::BinaryOutputArchive>
    void use_cereal(Container &&data, const std::string &fileName) {
        std::stringstream buffer;
        {
            OArchive oar(buffer);
            oar(data);
        }

        std::ofstream fout(fileName);
        fout << buffer.str();
        fout.close();
    }
}

// Performance test input data.
constexpr int NumberOfSamples = 5;
constexpr int NumberOfIterations = 2;
constexpr int N = 3000000;
const auto double_data = create_test_data<double>(N);
const auto float_data = create_test_data<float>(N);
const auto int_data = create_test_data<int>(N);
const auto size_t_data = create_test_data<size_t>(N);

CELERO_MAIN

BASELINE(DoubleVector, original_approach, NumberOfSamples, NumberOfIterations) {
    const std::string fileName("origsol.txt");
    original_approach(double_data, fileName);
}

BENCHMARK(DoubleVector, improved_original_approach, NumberOfSamples, NumberOfIterations) {
    const std::string fileName("improvedsol.txt");
    improved_original_approach(double_data.cbegin(), double_data.cend(), fileName);
}

BENCHMARK(DoubleVector, edgar_rokyan_solution, NumberOfSamples, NumberOfIterations) {
    const std::string fileName("edgar_rokyan_solution.txt");
    edgar_rokyan_solution(double_data.cbegin(), double_data.cend(), fileName);
}

BENCHMARK(DoubleVector, stringstream_approach, NumberOfSamples, NumberOfIterations) {
    const std::string fileName("stringstream.txt");
    stringstream_approach(double_data.cbegin(), double_data.cend(), fileName);
}

BENCHMARK(DoubleVector, sprintf_approach, NumberOfSamples, NumberOfIterations) {
    const std::string fileName("sprintf.txt");
    sprintf_approach(double_data.cbegin(), double_data.cend(), fileName);
}

BENCHMARK(DoubleVector, fmt_approach, NumberOfSamples, NumberOfIterations) {
    const std::string fileName("fmt.txt");
    fmt_approach(double_data.cbegin(), double_data.cend(), fileName);
}

BENCHMARK(DoubleVector, vector_of_char_approach, NumberOfSamples, NumberOfIterations) {
    const std::string fileName("vector_of_char.txt");
    vector_of_char_approach(double_data.cbegin(), double_data.cend(), fileName);
}

BENCHMARK(DoubleVector, use_cereal, NumberOfSamples, NumberOfIterations) {
    const std::string fileName("cereal.bin");
    use_cereal(double_data, fileName);
}

// Benchmark double vector
BASELINE(DoubleVectorConversion, toStringStream, NumberOfSamples, NumberOfIterations) {
    std::stringstream output;
    toStringStream(double_data.cbegin(), double_data.cend(), output);
}

BENCHMARK(DoubleVectorConversion, toMemoryWriter, NumberOfSamples, NumberOfIterations) {
    celero::DoNotOptimizeAway(toMemoryWriter(double_data.cbegin(), double_data.cend()));
}

BENCHMARK(DoubleVectorConversion, toVectorOfChar, NumberOfSamples, NumberOfIterations) {
    celero::DoNotOptimizeAway(toVectorOfChar(double_data.cbegin(), double_data.cend()));
}

// Benchmark float vector
BASELINE(FloatVectorConversion, toStringStream, NumberOfSamples, NumberOfIterations) {
    std::stringstream output;
    toStringStream(float_data.cbegin(), float_data.cend(), output);
}

BENCHMARK(FloatVectorConversion, toMemoryWriter, NumberOfSamples, NumberOfIterations) {
    celero::DoNotOptimizeAway(toMemoryWriter(float_data.cbegin(), float_data.cend()));
}

BENCHMARK(FloatVectorConversion, toVectorOfChar, NumberOfSamples, NumberOfIterations) {
    celero::DoNotOptimizeAway(toVectorOfChar(float_data.cbegin(), float_data.cend()));
}

// Benchmark int vector
BASELINE(int_conversion, toStringStream, NumberOfSamples, NumberOfIterations) {
    std::stringstream output;
    toStringStream(int_data.cbegin(), int_data.cend(), output);
}

BENCHMARK(int_conversion, toMemoryWriter, NumberOfSamples, NumberOfIterations) {
    celero::DoNotOptimizeAway(toMemoryWriter(int_data.cbegin(), int_data.cend()));
}

BENCHMARK(int_conversion, toVectorOfChar, NumberOfSamples, NumberOfIterations) {
    celero::DoNotOptimizeAway(toVectorOfChar(int_data.cbegin(), int_data.cend()));
}

// Benchmark size_t vector
BASELINE(size_t_conversion, toStringStream, NumberOfSamples, NumberOfIterations) {
    std::stringstream output;
    toStringStream(size_t_data.cbegin(), size_t_data.cend(), output);
}

BENCHMARK(size_t_conversion, toMemoryWriter, NumberOfSamples, NumberOfIterations) {
    celero::DoNotOptimizeAway(toMemoryWriter(size_t_data.cbegin(), size_t_data.cend()));
}

BENCHMARK(size_t_conversion, toVectorOfChar, NumberOfSamples, NumberOfIterations) {
    celero::DoNotOptimizeAway(toVectorOfChar(size_t_data.cbegin(), size_t_data.cend()));
}

Below are the performance results obtained on my Linux box using clang-3.9.1 with the -O3 flag. I use Celero to collect all performance results.

Timer resolution: 0.001000 us
-----------------------------------------------------------------------------------------------------------------------------------------------
     Group      |   Experiment    |   Prob. Space   |     Samples     |   Iterations    |    Baseline     |  us/Iteration   | Iterations/sec  | 
-----------------------------------------------------------------------------------------------------------------------------------------------
DoubleVector    | original_approa | Null            |              10 |               4 |         1.00000 |   3650309.00000 |            0.27 | 
DoubleVector    | improved_origin | Null            |              10 |               4 |         0.47828 |   1745855.00000 |            0.57 | 
DoubleVector    | edgar_rokyan_so | Null            |              10 |               4 |         0.45804 |   1672005.00000 |            0.60 | 
DoubleVector    | stringstream_ap | Null            |              10 |               4 |         0.41514 |   1515377.00000 |            0.66 | 
DoubleVector    | sprintf_approac | Null            |              10 |               4 |         0.35436 |   1293521.50000 |            0.77 | 
DoubleVector    | fmt_approach    | Null            |              10 |               4 |         0.34916 |   1274552.75000 |            0.78 | 
DoubleVector    | vector_of_char_ | Null            |              10 |               4 |         0.34366 |   1254462.00000 |            0.80 | 
DoubleVector    | use_cereal      | Null            |              10 |               4 |         0.04172 |    152291.25000 |            6.57 | 
Complete.

I also benchmarked the numeric-to-string conversion algorithms to compare the performance of std::stringstream, fmt::MemoryWriter, and std::vector<char>.

Timer resolution: 0.001000 us
-----------------------------------------------------------------------------------------------------------------------------------------------
     Group      |   Experiment    |   Prob. Space   |     Samples     |   Iterations    |    Baseline     |  us/Iteration   | Iterations/sec  | 
-----------------------------------------------------------------------------------------------------------------------------------------------
DoubleVectorCon | toStringStream  | Null            |              10 |               4 |         1.00000 |   1272667.00000 |            0.79 | 
FloatVectorConv | toStringStream  | Null            |              10 |               4 |         1.00000 |   1272573.75000 |            0.79 | 
int_conversion  | toStringStream  | Null            |              10 |               4 |         1.00000 |    248709.00000 |            4.02 | 
size_t_conversi | toStringStream  | Null            |              10 |               4 |         1.00000 |    252063.00000 |            3.97 | 
DoubleVectorCon | toMemoryWriter  | Null            |              10 |               4 |         0.98468 |   1253165.50000 |            0.80 | 
DoubleVectorCon | toVectorOfChar  | Null            |              10 |               4 |         0.97146 |   1236340.50000 |            0.81 | 
FloatVectorConv | toMemoryWriter  | Null            |              10 |               4 |         0.98419 |   1252454.25000 |            0.80 | 
FloatVectorConv | toVectorOfChar  | Null            |              10 |               4 |         0.97369 |   1239093.25000 |            0.81 | 
int_conversion  | toMemoryWriter  | Null            |              10 |               4 |         0.11741 |     29200.50000 |           34.25 | 
int_conversion  | toVectorOfChar  | Null            |              10 |               4 |         0.87105 |    216637.00000 |            4.62 | 
size_t_conversi | toMemoryWriter  | Null            |              10 |               4 |         0.13746 |     34649.50000 |           28.86 | 
size_t_conversi | toVectorOfChar  | Null            |              10 |               4 |         0.85345 |    215123.00000 |            4.65 | 
Complete.

From the above tables we can see that:

  1. Edgar Rokyan's solution is about 10% slower than the stringstream solution. The fmt-based solution is the best overall for the three data types studied (double, int, and size_t). The sprintf + std::vector<char> solution is about 1.6% faster than the fmt solution for the double data type. However, I do not recommend sprintf-based solutions for production code because they are not elegant (still written in C style) and do not work out of the box for different data types such as int or size_t.

  2. The benchmark results also show that fmt is the superior choice for serializing integral data types: in the conversion benchmark it is roughly 6x to 8.5x faster than the other approaches.

  3. We can speed up this algorithm substantially if we use a binary format: the cereal run is about 8x faster than the fastest text approach and about 24x faster than the original. This approach is significantly faster than writing a formatted text file because we only do a raw copy from memory to the output. If you want a more flexible and portable solution, try cereal, boost::serialization, or Protocol Buffers. According to this performance study, cereal seems to be the fastest.
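
To make the "raw copy" in point 3 concrete without any third-party library, here is a minimal sketch of a binary round trip using only the standard streams. The function names write_binary and read_binary are hypothetical, and the sketch assumes the file is read back on a machine with the same endianness and the same representation of double; it is not what cereal does internally.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Dumps the vector's raw bytes to a file; returns false on I/O failure.
// Assumption: reader and writer share endianness and double layout.
bool write_binary(const std::vector<double>& values, const std::string& path) {
    std::ofstream fout(path, std::ios::binary);
    fout.write(reinterpret_cast<const char*>(values.data()),
               static_cast<std::streamsize>(values.size() * sizeof(double)));
    fout.close();  // close() sets failbit if the final flush fails
    return !fout.fail();
}

// Reads the file back, deriving the element count from the file size.
// Returns an empty vector if the file cannot be opened.
std::vector<double> read_binary(const std::string& path) {
    std::ifstream fin(path, std::ios::binary | std::ios::ate);
    if (!fin) return {};
    const std::streamsize bytes = fin.tellg();
    std::vector<double> values(static_cast<std::size_t>(bytes) / sizeof(double));
    fin.seekg(0);
    fin.read(reinterpret_cast<char*>(values.data()),
             static_cast<std::streamsize>(values.size() * sizeof(double)));
    return values;
}
```

No number formatting happens at all here, which is where most of the time goes in the text-based variants.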


std::ofstream fout("vector.txt");
fout << std::setprecision(10);

for(auto const& x : vector)
    fout << x << '\n';

Everything I changed had theoretically worse performance in your version of the code, but the std::endl was the real killer. std::vector::at (with bounds checking, which you don't need) would be the second, then the fact that you did not use iterators.

Why default-construct a std::ofstream and then call open, when you can do it in one step? Why call close when RAII (the destructor) takes care of it for you? You can also call

fout << std::setprecision(10)

just once, before the loop.

As noted in the comment below, if your vector is of elements of fundamental type, you might get a better performance with for(auto x : vector). Measure the running time / inspect the assembly output.
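
One simple way to do that measurement, short of a full profiler, is to time each variant with std::chrono. The sketch below is only an illustration: elapsed_ms, write_by_value, and write_by_reference are hypothetical names, and a single run like this is noisy, so repeat it a few times before drawing conclusions.

```cpp
#include <chrono>
#include <fstream>
#include <iomanip>
#include <string>
#include <vector>

// Hypothetical helper: runs `work` once and returns wall-clock milliseconds.
template <typename Work>
double elapsed_ms(Work&& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Loop variable copied by value, as suggested for fundamental types.
void write_by_value(const std::vector<double>& values, const std::string& path) {
    std::ofstream fout(path);
    fout << std::setprecision(10);
    for (auto x : values)
        fout << x << '\n';
}

// Loop variable bound by const reference, as in the snippet above.
void write_by_reference(const std::vector<double>& values, const std::string& path) {
    std::ofstream fout(path);
    fout << std::setprecision(10);
    for (auto const& x : values)
        fout << x << '\n';
}
```

A call such as elapsed_ms([&] { write_by_value(data, "by_value.txt"); }) then gives one timing sample for that variant.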


Just to point out another thing that caught my eye, this:

for(l = 0; l < vector.size(); l++)

What is this l? Why declare it outside the loop? It seems you don't need it in the outer scope, so don't. Also, prefer pre-increment to post-increment.

The result:

for(size_t l = 0; l < vector.size(); ++l)

I'm sorry for making code review out of this post.


You can also use a rather neat way of writing the contents of any vector to a file, with the help of iterators and the copy function.

std::ofstream fout("vector.txt");
fout.precision(10);

std::copy(numbers.begin(), numbers.end(),
    std::ostream_iterator<double>(fout, "\n"));

This solution is practically the same as LogicStuff's solution in terms of execution time. But it also illustrates how to print the contents with just a single copy call, which, I think, looks pretty good.


OK, I'm sad that there are three solutions that attempt to give you a fish, but no solution that attempts to teach you how to fish.

When you have a performance problem, the solution is to use a profiler and fix whatever problem the profiler shows.

Converting double-to-string for 300,000 doubles will not take 3 minutes on any computer that has shipped in the last 10 years.

Writing 3 MB of data to disk (an average size of 300,000 doubles) will not take 3 minutes on any computer that has shipped in the last 10 years.

If you profile this, my guess is that you'll find that fout gets flushed 300,000 times, and that flushing is slow, because it may involve blocking, or semi-blocking, I/O. Thus, you need to avoid the blocking I/O. The typical way of doing that is to prepare all your I/O to a single buffer (create a stringstream, write to that) and then write that buffer to a physical file in one go. This is the solution hungptit describes, except I think that what's missing is explaining WHY that solution is a good solution.

Or, to put it another way: What the profiler will tell you is that calling write() (on Linux) or WriteFile() (on Windows) is much slower than just copying a few bytes into a memory buffer, because it's a user/kernel level transition. If std::endl causes this to happen for each double, you're going to have a bad (slow) time. Replace it with something that just stays in user space and puts data in RAM!
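
The per-element flush is easy to observe without a profiler. The following sketch, written under the assumption that counting flush requests is a fair stand-in for counting system calls on a real file, uses a hypothetical in-memory stream buffer (CountingBuf) that counts how often the stream asks it to synchronize.

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>
#include <vector>

// Hypothetical helper: an in-memory stream buffer that counts flush requests.
class CountingBuf : public std::stringbuf {
public:
    std::size_t flushes = 0;

protected:
    // std::flush and std::endl end up here via pubsync().
    int sync() override {
        ++flushes;
        return 0;
    }
};

// Writes every value followed by std::endl; returns the number of flushes.
std::size_t flushes_with_endl(const std::vector<double>& values) {
    CountingBuf buf;
    std::ostream out(&buf);
    for (double x : values)
        out << x << std::endl;
    return buf.flushes;
}

// Writes every value followed by '\n'; returns the number of flushes.
std::size_t flushes_with_newline(const std::vector<double>& values) {
    CountingBuf buf;
    std::ostream out(&buf);
    for (double x : values)
        out << x << '\n';
    return buf.flushes;
}
```

With std::endl the count equals the number of elements; with '\n' it stays at zero, and on a file stream the data only leaves user space when the buffer fills or the stream is closed.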

If that's still not fast enough, it may be that the specific-precision version of operator<<() on strings is slow or involves unnecessary overhead. If so, you may be able to further speed up the code by using sprintf() or some other potentially faster function to generate data into the in-memory buffer, before you finally write the entire buffer to a file in one go.
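
A minimal sketch of that last step, assuming snprintf with the %g conversion is an acceptable replacement for the stream's default floating-point formatting: every value is formatted into one in-memory string, which is then written with a single call. The names format_doubles and write_text_file are hypothetical.

```cpp
#include <cstddef>
#include <cstdio>
#include <fstream>
#include <string>
#include <vector>

// Formats one value per line with `precision` significant digits (printf's %g),
// which mirrors std::setprecision(precision) in the default notation.
std::string format_doubles(const std::vector<double>& values, int precision) {
    std::string out;
    out.reserve(values.size() * 16);  // rough guess to limit reallocations
    char line[64];                    // assumed large enough for common precisions
    for (double x : values) {
        const int n = std::snprintf(line, sizeof line, "%.*g\n", precision, x);
        // Lines that fail to format or would not fit are skipped in this sketch.
        if (n > 0 && static_cast<std::size_t>(n) < sizeof line)
            out.append(line, static_cast<std::size_t>(n));
    }
    return out;
}

// Writes the whole buffer with one stream write; returns false on failure.
bool write_text_file(const std::string& path, const std::string& text) {
    std::ofstream fout(path, std::ios::binary);
    fout.write(text.data(), static_cast<std::streamsize>(text.size()));
    fout.close();
    return !fout.fail();
}
```

For the question's data this would be used as write_text_file("vector.txt", format_doubles(vector, 10)).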


You have two main bottlenecks in your program: output and formatting text.

To increase performance, you will want to increase the amount of data output per call. For example, 1 output transfer of 500 characters is faster than 500 transfers of 1 character.

My recommendation is you format the data to a big buffer, then block write the buffer.

Here's an example:

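static char buffer[1024 * 1024];  // static storage: 1 MiB may not fit on the stack
size_t buffer_index = 0;
const size_t size = my_vector.size();
for (size_t i = 0; i < size; ++i)
{
  // Format one value per line into the space that is left.
  int characters_formatted = snprintf(&buffer[buffer_index],
                                      sizeof(buffer) - buffer_index,
                                      "%.10f\n", my_vector[i]);
  if (characters_formatted >= (int) (sizeof(buffer) - buffer_index))
  {
      // Output was truncated: flush the buffer and format the value again.
      std::cout.write(buffer, buffer_index);
      buffer_index = 0;
      characters_formatted = snprintf(buffer, sizeof(buffer), "%.10f\n", my_vector[i]);
  }
  if (characters_formatted > 0)
  {
      buffer_index += (size_t) characters_formatted;
  }
}
std::cout.write(buffer, buffer_index);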

You should first try changing optimization settings in your compiler before messing with the code.


Here is a slightly different solution: save your doubles in binary form.

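// POSIX only; needs <fcntl.h> for open() and <unistd.h> for write() and close()
int fd = ::open("/path/to/the/file", O_WRONLY | O_CREAT | O_TRUNC, 0644 /* whatever permission */);
::write(fd, &vector[0], vector.size() * sizeof(vector[0]));
::close(fd);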

Since you mentioned that you have 300k doubles, which amounts to 300k * 8 bytes = 2.4 MB, you can save all of them to a local disk file in less than 0.1 second. The only drawback of this method is that the saved file is not as readable as the string representation, but a hex editor can solve that problem.

If you prefer a more robust way, there are plenty of serialization libraries and tools available online. They provide more benefits, such as being language-neutral and machine-independent, offering flexible compression algorithms, etc. Those are the two I usually use:

ReferenceURL : https://stackoverflow.com/questions/39753721/fast-way-to-write-data-from-a-stdvector-to-a-text-file
