Merge "release 1.4.0"
Jenkins authored 7 years ago
Gerrit Code Review committed 7 years ago
5 | 5 | Contributors |
6 | 6 | ------------ |
7 | 7 | Timur Alperovich (timuralp@swiftstack.com) |
8 | Tim Burke (tim.burke@gmail.com) | |
8 | 9 | Thiago da Silva (thiago@redhat.com) |
9 | 10 | Eric Lambert (eric_lambert@xyratex.com) |
10 | 11 | Ondřej Nový (ondrej.novy@firma.seznam.cz) |
0 | New in 1.4.0 | |
1 | ------------ | |
2 | ||
3 | * Add support for ISA-L Cauchy | |
4 | * Fixed memory leak in get_metadata | |
5 | * Added soft warning log line when using liberasurecode <1.3.1 | |
6 | ||
0 | 7 | New in 1.3.1 |
1 | 8 | ------------ |
2 | 9 |
0 | This library provides a simple Python interface for implementing erasure codes | |
1 | and is known to work with Python v2.6, 2.7 and 3.x. | |
2 | ||
3 | To obtain the best possible performance, the library utilizes liberasurecode, | |
4 | which is a C based erasure code library. Please let us know if you have any | |
5 | issues building or installing (email: kmgreen2@gmail.com or tusharsg@gmail.com). | |
6 | ||
7 | PyECLib supports a variety of Erasure Coding backends including the standard | |
8 | Reed-Solomon implementations provided by Jerasure [2], liberasurecode [3] and Intel | |
9 | ISA-L [4]. It also provides support for a flat XOR-based encoder and decoder | |
10 | (part of liberasurecode), a class of HD Combination Codes based on "Flat | |
11 | XOR-based erasure codes in storage systems: Constructions, efficient recovery, | |
12 | and tradeoffs" in IEEE MSST 2010. These codes are well-suited to archival | |
13 | use-cases, have a simple construction and require a minimum number of | |
14 | participating disks during single-disk reconstruction (think XOR-based LRC code). | |
15 | ||
16 | Examples of using PyECLib are provided in the "tools" directory: | |
17 | ||
18 | Command-line encoder:: | |
19 | ||
20 | tools/pyeclib_encode.py | |
21 | ||
22 | Command-line decoder:: | |
23 | ||
24 | tools/pyeclib_decode.py | |
25 | ||
26 | Utility to determine what is needed to reconstruct missing fragments:: | |
27 | ||
28 | tools/pyeclib_fragments_needed.py | |
29 | ||
30 | ||
31 | PyECLib initialization:: | |
32 | ||
33 | ec_driver = ECDriver(k=<num_encoded_data_fragments>, | |
34 | m=<num_encoded_parity_fragments>, | |
35 | ec_type=<ec_scheme>) | |
36 | ||
37 | Supported ``ec_type`` values: | |
38 | ||
39 | * ``liberasurecode_rs_vand`` => Vandermonde Reed-Solomon encoding, software-only backend implemented by liberasurecode [3] | |
40 | * ``jerasure_rs_vand`` => Vandermonde Reed-Solomon encoding, based on Jerasure [1] | |
41 | * ``jerasure_rs_cauchy`` => Cauchy Reed-Solomon encoding (Jerasure variant), based on Jerasure [2] | |
42 | * ``flat_xor_hd_3``, ``flat_xor_hd_4`` => Flat-XOR based HD combination codes, liberasurecode [3] | |
43 | * ``isa_l_rs_vand`` => Intel Storage Acceleration Library (ISA-L) - SIMD accelerated Erasure Coding backends [4] | |
44 | * ``shss`` => NTT Lab Japan's Erasure Coding Library | |
45 | ||
46 | A configuration utility is provided to help compare available EC schemes in | |
47 | terms of performance and redundancy:: `tools/pyeclib_conf_tool.py` | |
48 | ||
49 | ||
50 | The Python API supports the following functions: | |
51 | ||
52 | - EC Encode | |
53 | ||
54 | Encode N bytes of a data object into k (data) + m (parity) fragments:: | |
55 | ||
56 | def encode(self, data_bytes) | |
57 | ||
58 | input: data_bytes - input data object (bytes) | |
59 | returns: list of fragments (bytes) | |
60 | throws: | |
61 | ECBackendInstanceNotAvailable - if the backend library cannot be found | |
62 | ECBackendNotSupported - if the backend is not supported by PyECLib (see ec_types above) | |
63 | ECInvalidParameter - if invalid parameters were provided | |
64 | ECOutOfMemory - if the process has run out of memory | |
65 | ECDriverError - if an unknown error occurs | |
66 | ||
67 | - EC Decode | |
68 | ||
69 | Decode between k and k+m fragments into original object:: | |
70 | ||
71 | def decode(self, fragment_payloads) | |
72 | ||
73 | input: list of fragment_payloads (bytes) | |
74 | returns: decoded object (bytes) | |
75 | throws: | |
76 | ECBackendInstanceNotAvailable - if the backend library cannot be found | |
77 | ECBackendNotSupported - if the backend is not supported by PyECLib (see ec_types above) | |
78 | ECInvalidParameter - if invalid parameters were provided | |
79 | ECOutOfMemory - if the process has run out of memory | |
80 | ECInsufficientFragments - if an insufficient set of fragments has been provided (e.g. not enough) | |
81 | ECInvalidFragmentMetadata - if the fragment headers appear to be corrupted | |
82 | ECDriverError - if an unknown error occurs | |
83 | ||
84 | ||
85 | *Note*: ``bytes`` is a synonym to ``str`` in Python 2.6, 2.7. | |
86 | In Python 3.x, ``bytes`` and ``str`` types are non-interchangeable and care | |
87 | needs to be taken when handling input to and output from the ``encode()`` and | |
88 | ``decode()`` routines. | |
89 | ||
90 | ||
91 | - EC Reconstruct | |
92 | ||
93 | Reconstruct "missing_fragment_indexes" using "available_fragment_payloads":: | |
94 | ||
95 | def reconstruct(self, available_fragment_payloads, missing_fragment_indexes) | |
96 | ||
97 | input: available_fragment_payloads - list of fragment payloads | |
98 | input: missing_fragment_indexes - list of indexes to reconstruct | |
99 | output: list of reconstructed fragments corresponding to missing_fragment_indexes | |
100 | throws: | |
101 | ECBackendInstanceNotAvailable - if the backend library cannot be found | |
102 | ECBackendNotSupported - if the backend is not supported by PyECLib (see ec_types above) | |
103 | ECInvalidParameter - if invalid parameters were provided | |
104 | ECOutOfMemory - if the process has run out of memory | |
105 | ECInsufficientFragments - if an insufficient set of fragments has been provided (e.g. not enough) | |
106 | ECInvalidFragmentMetadata - if the fragment headers appear to be corrupted | |
107 | ECDriverError - if an unknown error occurs | |
108 | ||
109 | ||
110 | - Minimum parity fragments needed for durability guarantees:: | |
111 | ||
112 | def min_parity_fragments_needed(self) | |
113 | ||
114 | NOTE: Currently hard-coded to 1, so this can only be trusted for MDS codes, such as | |
115 | Reed-Solomon. | |
116 | ||
117 | output: minimum number of additional fragments needed to be synchronously written to tolerate | |
118 | the loss of any one fragment (similar guarantees to 2 out of 3 with 3x replication) | |
119 | throws: | |
120 | ECBackendInstanceNotAvailable - if the backend library cannot be found | |
121 | ECBackendNotSupported - if the backend is not supported by PyECLib (see ec_types above) | |
122 | ECInvalidParameter - if invalid parameters were provided | |
123 | ECOutOfMemory - if the process has run out of memory | |
124 | ECDriverError - if an unknown error occurs | |
125 | ||
126 | ||
127 | - Fragments needed for EC Reconstruct | |
128 | ||
129 | Return the indexes of fragments needed to reconstruct "missing_fragment_indexes":: | |
130 | ||
131 | def fragments_needed(self, missing_fragment_indexes) | |
132 | ||
133 | input: list of missing_fragment_indexes | |
134 | output: list of fragments needed to reconstruct fragments listed in missing_fragment_indexes | |
135 | throws: | |
136 | ECBackendInstanceNotAvailable - if the backend library cannot be found | |
137 | ECBackendNotSupported - if the backend is not supported by PyECLib (see ec_types above) | |
138 | ECInvalidParameter - if invalid parameters were provided | |
139 | ECOutOfMemory - if the process has run out of memory | |
140 | ECDriverError - if an unknown error occurs | |
141 | ||
142 | ||
143 | - Get EC Metadata | |
144 | ||
145 | Return an opaque header known by the underlying library or a formatted header (Python dict):: | |
146 | ||
147 | def get_metadata(self, fragment, formatted = 0) | |
148 | ||
149 | input: raw fragment payload | |
150 | input: boolean specifying if returned header is opaque buffer or formatted string | |
151 | output: fragment header (opaque or formatted) | |
152 | throws: | |
153 | ECBackendInstanceNotAvailable - if the backend library cannot be found | |
154 | ECBackendNotSupported - if the backend is not supported by PyECLib (see ec_types above) | |
155 | ECInvalidParameter - if invalid parameters were provided | |
156 | ECOutOfMemory - if the process has run out of memory | |
157 | ECDriverError - if an unknown error occurs | |
158 | ||
159 | - Verify EC Stripe Consistency | |
160 | ||
161 | Use opaque buffers from get_metadata() to verify the consistency of a stripe:: | |
162 | ||
163 | def verify_stripe_metadata(self, fragment_metadata_list) | |
164 | ||
165 | input: list of opaque fragment headers | |
166 | output: formatted string containing the 'status' (0 is success) and 'reason' if verification fails | |
167 | throws: | |
168 | ECBackendInstanceNotAvailable - if the backend library cannot be found | |
169 | ECBackendNotSupported - if the backend is not supported by PyECLib (see ec_types above) | |
170 | ECInvalidParameter - if invalid parameters were provided | |
171 | ECOutOfMemory - if the process has run out of memory | |
172 | ECDriverError - if an unknown error occurs | |
173 | ||
174 | ||
175 | - Get EC Segment Info | |
176 | ||
177 | Return a dict with the keys - segment_size, last_segment_size, fragment_size, last_fragment_size and num_segments:: | |
178 | ||
179 | def get_segment_info(self, data_len, segment_size) | |
180 | ||
181 | input: total data_len of the object to store | |
182 | input: target segment size used to segment the object into multiple EC stripes | |
183 | output: a dict with keys - segment_size, last_segment_size, fragment_size, last_fragment_size and num_segments | |
184 | throws: | |
185 | ECBackendInstanceNotAvailable - if the backend library cannot be found | |
186 | ECBackendNotSupported - if the backend is not supported by PyECLib (see ec_types above) | |
187 | ECInvalidParameter - if invalid parameters were provided | |
188 | ECOutOfMemory - if the process has run out of memory | |
189 | ECDriverError - if an unknown error occurs | |
190 | ||
191 | ||
192 | - Get EC Segment Info given a list of ranges, data length and segment size:: | |
193 | ||
194 | def get_segment_info_byterange(self, ranges, data_len, segment_size) | |
195 | ||
196 | input: byte ranges | |
197 | input: total data_len of the object to store | |
198 | input: target segment size used to segment the object into multiple EC stripes | |
199 | output: (see below) | |
200 | throws: | |
201 | ECBackendInstanceNotAvailable - if the backend library cannot be found | |
202 | ECBackendNotSupported - if the backend is not supported by PyECLib (see ec_types above) | |
203 | ECInvalidParameter - if invalid parameters were provided | |
204 | ECOutOfMemory - if the process has run out of memory | |
205 | ECDriverError - if an unknown error occurs | |
206 | ||
207 | Assume a range request is given for an object with segment size 3K and | |
208 | a 1 MB file:: | |
209 | ||
210 | Ranges = (0, 1), (1, 12), (10, 1000), (0, segment_size-1), | |
211 | (1, segment_size+1), (segment_size-1, 2*segment_size) | |
212 | ||
213 | This will return a map keyed on the ranges, where there is a recipe | |
214 | given for each range:: | |
215 | ||
216 | { | |
217 | (0, 1): {0: (0, 1)}, | |
218 | (10, 1000): {0: (10, 1000)}, | |
219 | (1, 12): {0: (1, 12)}, | |
220 | (0, 3071): {0: (0, 3071)}, | |
221 | (3071, 6144): {0: (3071, 3071), 1: (0, 3071), 2: (0, 0)}, | |
222 | (1, 3073): {0: (1, 3071), 1: (0, 1)} | |
223 | } | |
224 | ||
225 | ||
226 | Quick Start | |
227 | ||
228 | Install pre-requisites: | |
229 | ||
230 | * Python 2.6, 2.7 or 3.x (including development packages), argparse, setuptools | |
231 | * liberasurecode v1.2.0 or greater [3] | |
232 | * Erasure code backend libraries, gf-complete and Jerasure [1],[2], ISA-L [4] etc | |
233 | ||
234 | An example for ubuntu to install dependency packages: | |
235 | $ sudo apt-get install build-essential python-dev python-pip liberasurecode-dev | |
236 | $ sudo pip install -U bindep -r test-requirement.txt | |
237 | ||
238 | If you want to confirm all dependency packages installed succuessfully, try: | |
239 | $ sudo bindep -f bindep.txt | |
240 | ||
241 | That shows missing dependency packages for you, http://docs.openstack.org/infra/bindep/ | |
242 | ||
243 | *Note*: currently liberasurecode-dev/liberasurecode-devel in package repo is older | |
244 | than v1.2.0 | |
245 | ||
246 | Install PyECLib:: | |
247 | $ sudo python setup.py install | |
248 | ||
249 | Run test suite included:: | |
250 | ||
251 | $ ./.unittests | |
252 | ||
253 | If all of this works, then you should be good to go. If not, send us an email! | |
254 | ||
255 | If the test suite fails because it cannot find any of the shared libraries, | |
256 | then you probably need to add /usr/local/lib to the path searched when loading | |
257 | libraries. The best way to do this (on Linux) is to add '/usr/local/lib' to:: | |
258 | ||
259 | /etc/ld.so.conf | |
260 | ||
261 | and then make sure to run:: | |
262 | ||
263 | $ sudo ldconfig | |
264 | ||
265 | ||
266 | References | |
267 | ||
268 | [1] Jerasure, C library that supports erasure coding in storage applications, http://jerasure.org | |
269 | ||
270 | [2] Greenan, Kevin M et al, "Flat XOR-based erasure codes in storage systems", http://www.kaymgee.com/Kevin_Greenan/Publications_files/greenan-msst10.pdf | |
271 | ||
272 | [3] liberasurecode, C API abstraction layer for erasure coding backends, https://github.com/openstack/liberasurecode | |
273 | ||
274 | [4] Intel(R) Storage Acceleration Library (Open Source Version), https://01.org/intel%C2%AE-storage-acceleration-library-open-source-version | |
275 | ||
276 | [5] Kota Tsuyuzaki <tsuyuzaki.kota@lab.ntt.co.jp>, "NTT SHSS Erasure Coding backend" |
0 | PyECLib | |
1 | ------- | |
2 | ||
3 | This library provides a simple Python interface for implementing erasure codes | |
4 | and is known to work with Python v2.6, 2.7 and 3.x. | |
5 | ||
6 | To obtain the best possible performance, the library utilizes liberasurecode, | |
7 | which is a C based erasure code library. Please let us know if you have any | |
8 | issues building or installing (email: kmgreen2@gmail.com or tusharsg@gmail.com). | |
9 | ||
10 | PyECLib supports a variety of Erasure Coding backends including the standard | |
11 | Reed-Solomon implementations provided by Jerasure [1], liberasurecode [3] and Intel | |
12 | ISA-L [4]. It also provides support for a flat XOR-based encoder and decoder | |
13 | (part of liberasurecode), a class of HD Combination Codes based on "Flat | |
14 | XOR-based erasure codes in storage systems: Constructions, efficient recovery, | |
15 | and tradeoffs" in IEEE MSST 2010 [2]. These codes are well-suited to archival | |
16 | use-cases, have a simple construction and require a minimum number of | |
17 | participating disks during single-disk reconstruction (think XOR-based LRC code). | |
18 | ||
19 | Examples of using PyECLib are provided in the "tools" directory: | |
20 | ||
21 | Command-line encoder:: | |
22 | ||
23 | tools/pyeclib_encode.py | |
24 | ||
25 | Command-line decoder:: | |
26 | ||
27 | tools/pyeclib_decode.py | |
28 | ||
29 | Utility to determine what is needed to reconstruct missing fragments:: | |
30 | ||
31 | tools/pyeclib_fragments_needed.py | |
32 | ||
33 | A configuration utility to help compare available EC schemes in terms of | |
34 | performance and redundancy:: | |
35 | ||
36 | tools/pyeclib_conf_tool.py | |
37 | ||
38 | PyECLib initialization:: | |
39 | ||
40 | ec_driver = ECDriver(k=<num_encoded_data_fragments>, | |
41 | m=<num_encoded_parity_fragments>, | |
42 | ec_type=<ec_scheme>) | |
43 | ||
44 | Supported ``ec_type`` values: | |
45 | ||
46 | * ``liberasurecode_rs_vand`` => Vandermonde Reed-Solomon encoding, software-only backend implemented by liberasurecode [3] | |
47 | * ``jerasure_rs_vand`` => Vandermonde Reed-Solomon encoding, based on Jerasure [1] | |
48 | * ``jerasure_rs_cauchy`` => Cauchy Reed-Solomon encoding (Jerasure variant), based on Jerasure [1] | |
49 | * ``flat_xor_hd_3``, ``flat_xor_hd_4`` => Flat-XOR based HD combination codes, liberasurecode [3] | |
50 | * ``isa_l_rs_vand`` => Intel Storage Acceleration Library (ISA-L) - SIMD accelerated Erasure Coding backends [4] | |
51 | * ``isa_l_rs_cauchy`` => Cauchy Reed-Solomon encoding (ISA-L variant) [4] | |
52 | * ``shss`` => NTT Lab Japan's Erasure Coding Library [5] | |
53 | ||
54 | ||
55 | The Python API supports the following functions: | |
56 | ||
57 | - EC Encode | |
58 | ||
59 | Encode N bytes of a data object into k (data) + m (parity) fragments:: | |
60 | ||
61 | def encode(self, data_bytes) | |
62 | ||
63 | input: data_bytes - input data object (bytes) | |
64 | returns: list of fragments (bytes) | |
65 | throws: | |
66 | ECBackendInstanceNotAvailable - if the backend library cannot be found | |
67 | ECBackendNotSupported - if the backend is not supported by PyECLib (see ec_types above) | |
68 | ECInvalidParameter - if invalid parameters were provided | |
69 | ECOutOfMemory - if the process has run out of memory | |
70 | ECDriverError - if an unknown error occurs | |
71 | ||
72 | - EC Decode | |
73 | ||
74 | Decode between k and k+m fragments into original object:: | |
75 | ||
76 | def decode(self, fragment_payloads) | |
77 | ||
78 | input: list of fragment_payloads (bytes) | |
79 | returns: decoded object (bytes) | |
80 | throws: | |
81 | ECBackendInstanceNotAvailable - if the backend library cannot be found | |
82 | ECBackendNotSupported - if the backend is not supported by PyECLib (see ec_types above) | |
83 | ECInvalidParameter - if invalid parameters were provided | |
84 | ECOutOfMemory - if the process has run out of memory | |
85 | ECInsufficientFragments - if an insufficient set of fragments has been provided (e.g. not enough) | |
86 | ECInvalidFragmentMetadata - if the fragment headers appear to be corrupted | |
87 | ECDriverError - if an unknown error occurs | |
88 | ||
89 | ||
90 | *Note*: ``bytes`` is a synonym to ``str`` in Python 2.6, 2.7. | |
91 | In Python 3.x, ``bytes`` and ``str`` types are non-interchangeable and care | |
92 | needs to be taken when handling input to and output from the ``encode()`` and | |
93 | ``decode()`` routines. | |
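
As a rough illustration of the encode/decode calls and the ``bytes`` requirement described above, here is a minimal sketch. The import path ``pyeclib.ec_iface`` and the helper name ``round_trip`` are assumptions made for this example and are not stated in this README. The choice of ``k=4``, ``m=2`` is arbitrary. The helper returns ``None`` when PyECLib cannot be imported.

```python
try:
    # Assumed import path; this README does not state it.
    from pyeclib.ec_iface import ECDriver
except ImportError:
    ECDriver = None  # PyECLib (or its C extension) is not installed


def round_trip(payload, k=4, m=2, ec_type="liberasurecode_rs_vand"):
    """Encode `payload`, discard m fragments, and decode the remainder.

    Hypothetical helper: returns the decoded bytes, or None when
    PyECLib is unavailable on this machine.
    """
    if not isinstance(payload, bytes):
        # On Python 3, str and bytes are not interchangeable.
        raise TypeError("payload must be bytes, not str")
    if ECDriver is None:
        return None
    driver = ECDriver(k=k, m=m, ec_type=ec_type)
    fragments = driver.encode(payload)  # list of k + m fragments (bytes)
    survivors = fragments[m:]           # keep only k of the fragments
    return driver.decode(survivors)


if __name__ == "__main__":
    original = "erasure coding".encode("utf-8")  # convert str to bytes first
    print(round_trip(original))
```

On Python 3, text has to be encoded to ``bytes`` before it is passed to ``encode()``, and the result of ``decode()`` is ``bytes`` as well.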
94 | ||
95 | ||
96 | - EC Reconstruct | |
97 | ||
98 | Reconstruct "missing_fragment_indexes" using "available_fragment_payloads":: | |
99 | ||
100 | def reconstruct(self, available_fragment_payloads, missing_fragment_indexes) | |
101 | ||
102 | input: available_fragment_payloads - list of fragment payloads | |
103 | input: missing_fragment_indexes - list of indexes to reconstruct | |
104 | output: list of reconstructed fragments corresponding to missing_fragment_indexes | |
105 | throws: | |
106 | ECBackendInstanceNotAvailable - if the backend library cannot be found | |
107 | ECBackendNotSupported - if the backend is not supported by PyECLib (see ec_types above) | |
108 | ECInvalidParameter - if invalid parameters were provided | |
109 | ECOutOfMemory - if the process has run out of memory | |
110 | ECInsufficientFragments - if an insufficient set of fragments has been provided (e.g. not enough) | |
111 | ECInvalidFragmentMetadata - if the fragment headers appear to be corrupted | |
112 | ECDriverError - if an unknown error occurs | |
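
To make the idea behind reconstruction concrete, the following is a self-contained sketch of the simplest possible erasure code: k data fragments plus a single XOR parity fragment. It is not the algorithm used by PyECLib or any of its backends, and it adds no fragment headers. The function names ``encode_single_parity`` and ``reconstruct_single`` are hypothetical and exist only for this illustration.

```python
from functools import reduce


def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode_single_parity(data, k):
    """Split data into k equal data fragments plus one XOR parity fragment."""
    frag_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(frag_len * k, b"\0")
    fragments = [padded[i * frag_len:(i + 1) * frag_len] for i in range(k)]
    fragments.append(reduce(xor_bytes, fragments))  # parity fragment
    return fragments


def reconstruct_single(available, missing_index):
    """Rebuild one missing fragment from all of the other fragments.

    `available` maps fragment index -> payload and must contain every
    fragment except `missing_index`.
    """
    survivors = [p for i, p in sorted(available.items()) if i != missing_index]
    return reduce(xor_bytes, survivors)


if __name__ == "__main__":
    fragments = encode_single_parity(b"erasure coding example", k=4)
    available = {i: f for i, f in enumerate(fragments) if i != 2}
    rebuilt = reconstruct_single(available, missing_index=2)
    print(rebuilt == fragments[2])
```

With one parity fragment this scheme tolerates the loss of any single fragment. Codes such as Reed-Solomon generalize the same idea to m parity fragments.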
113 | ||
114 | ||
115 | - Minimum parity fragments needed for durability guarantees:: | |
116 | ||
117 | def min_parity_fragments_needed(self) | |
118 | ||
119 | NOTE: Currently hard-coded to 1, so this can only be trusted for MDS codes, such as | |
120 | Reed-Solomon. | |
121 | ||
122 | output: minimum number of additional fragments needed to be synchronously written to tolerate | |
123 | the loss of any one fragment (similar guarantees to 2 out of 3 with 3x replication) | |
124 | throws: | |
125 | ECBackendInstanceNotAvailable - if the backend library cannot be found | |
126 | ECBackendNotSupported - if the backend is not supported by PyECLib (see ec_types above) | |
127 | ECInvalidParameter - if invalid parameters were provided | |
128 | ECOutOfMemory - if the process has run out of memory | |
129 | ECDriverError - if an unknown error occurs | |
130 | ||
131 | ||
132 | - Fragments needed for EC Reconstruct | |
133 | ||
134 | Return the indexes of fragments needed to reconstruct "missing_fragment_indexes":: | |
135 | ||
136 | def fragments_needed(self, missing_fragment_indexes) | |
137 | ||
138 | input: list of missing_fragment_indexes | |
139 | output: list of fragments needed to reconstruct fragments listed in missing_fragment_indexes | |
140 | throws: | |
141 | ECBackendInstanceNotAvailable - if the backend library cannot be found | |
142 | ECBackendNotSupported - if the backend is not supported by PyECLib (see ec_types above) | |
143 | ECInvalidParameter - if invalid parameters were provided | |
144 | ECOutOfMemory - if the process has run out of memory | |
145 | ECDriverError - if an unknown error occurs | |
146 | ||
147 | ||
148 | - Get EC Metadata | |
149 | ||
150 | Return an opaque header known by the underlying library or a formatted header (Python dict):: | |
151 | ||
152 | def get_metadata(self, fragment, formatted = 0) | |
153 | ||
154 | input: raw fragment payload | |
155 | input: boolean specifying if returned header is opaque buffer or formatted string | |
156 | output: fragment header (opaque or formatted) | |
157 | throws: | |
158 | ECBackendInstanceNotAvailable - if the backend library cannot be found | |
159 | ECBackendNotSupported - if the backend is not supported by PyECLib (see ec_types above) | |
160 | ECInvalidParameter - if invalid parameters were provided | |
161 | ECOutOfMemory - if the process has run out of memory | |
162 | ECDriverError - if an unknown error occurs | |
163 | ||
164 | - Verify EC Stripe Consistency | |
165 | ||
166 | Use opaque buffers from get_metadata() to verify the consistency of a stripe:: | |
167 | ||
168 | def verify_stripe_metadata(self, fragment_metadata_list) | |
169 | ||
170 | input: list of opaque fragment headers | |
171 | output: formatted string containing the 'status' (0 is success) and 'reason' if verification fails | |
172 | throws: | |
173 | ECBackendInstanceNotAvailable - if the backend library cannot be found | |
174 | ECBackendNotSupported - if the backend is not supported by PyECLib (see ec_types above) | |
175 | ECInvalidParameter - if invalid parameters were provided | |
176 | ECOutOfMemory - if the process has run out of memory | |
177 | ECDriverError - if an unknown error occurs | |
178 | ||
179 | ||
180 | - Get EC Segment Info | |
181 | ||
182 | Return a dict with the keys - segment_size, last_segment_size, fragment_size, last_fragment_size and num_segments:: | |
183 | ||
184 | def get_segment_info(self, data_len, segment_size) | |
185 | ||
186 | input: total data_len of the object to store | |
187 | input: target segment size used to segment the object into multiple EC stripes | |
188 | output: a dict with keys - segment_size, last_segment_size, fragment_size, last_fragment_size and num_segments | |
189 | throws: | |
190 | ECBackendInstanceNotAvailable - if the backend library cannot be found | |
191 | ECBackendNotSupported - if the backend is not supported by PyECLib (see ec_types above) | |
192 | ECInvalidParameter - if invalid parameters were provided | |
193 | ECOutOfMemory - if the process has run out of memory | |
194 | ECDriverError - if an unknown error occurs | |
195 | ||
196 | ||
197 | - Get EC Segment Info given a list of ranges, data length and segment size:: | |
198 | ||
199 | def get_segment_info_byterange(self, ranges, data_len, segment_size) | |
200 | ||
201 | input: byte ranges | |
202 | input: total data_len of the object to store | |
203 | input: target segment size used to segment the object into multiple EC stripes | |
204 | output: (see below) | |
205 | throws: | |
206 | ECBackendInstanceNotAvailable - if the backend library cannot be found | |
207 | ECBackendNotSupported - if the backend is not supported by PyECLib (see ec_types above) | |
208 | ECInvalidParameter - if invalid parameters were provided | |
209 | ECOutOfMemory - if the process has run out of memory | |
210 | ECDriverError - if an unknown error occurs | |
211 | ||
212 | Assume a range request is given for an object with segment size 3K and | |
213 | a 1 MB file:: | |
214 | ||
215 | Ranges = (0, 1), (1, 12), (10, 1000), (0, segment_size-1), | |
216 | (1, segment_size+1), (segment_size-1, 2*segment_size) | |
217 | ||
218 | This will return a map keyed on the ranges, where there is a recipe | |
219 | given for each range:: | |
220 | ||
221 | { | |
222 | (0, 1): {0: (0, 1)}, | |
223 | (10, 1000): {0: (10, 1000)}, | |
224 | (1, 12): {0: (1, 12)}, | |
225 | (0, 3071): {0: (0, 3071)}, | |
226 | (3071, 6144): {0: (3071, 3071), 1: (0, 3071), 2: (0, 0)}, | |
227 | (1, 3073): {0: (1, 3071), 1: (0, 1)} | |
228 | } | |
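
The recipe above can be approximated with plain integer arithmetic. The sketch below is an illustration only, under the assumptions that ranges are inclusive ``(start, end)`` byte offsets and that the segment size is used exactly as given. The real ``get_segment_info_byterange`` also takes ``data_len`` and may adjust the segment size. The name ``segment_recipe`` is hypothetical.

```python
def segment_recipe(ranges, segment_size):
    """Map each inclusive (start, end) byte range to per-segment offsets.

    Returns {range: {segment_index: (first_offset, last_offset)}} where
    the offsets are relative to the start of each segment.
    """
    recipe = {}
    for start, end in ranges:
        first = start // segment_size
        last = end // segment_size
        per_segment = {}
        for seg in range(first, last + 1):
            # Only the first segment starts mid-segment; only the last
            # segment ends before the segment boundary.
            seg_start = start % segment_size if seg == first else 0
            seg_end = end % segment_size if seg == last else segment_size - 1
            per_segment[seg] = (seg_start, seg_end)
        recipe[(start, end)] = per_segment
    return recipe


if __name__ == "__main__":
    segment_size = 3 * 1024
    ranges = [(0, 1), (10, 1000), (segment_size - 1, 2 * segment_size)]
    for byte_range, mapping in segment_recipe(ranges, segment_size).items():
        print(byte_range, mapping)
```

A range that crosses a segment boundary produces one entry per segment it touches, so a caller only needs to fetch and decode those segments.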
229 | ||
230 | ||
231 | Quick Start | |
232 | ||
233 | Install pre-requisites:: | |
234 | ||
235 | * Python 2.6, 2.7 or 3.x (including development packages), argparse, setuptools | |
236 | * liberasurecode v1.2.0 or greater [3] | |
237 | * Erasure code backend libraries, gf-complete and Jerasure [1],[2], ISA-L [4] etc | |
238 | ||
239 | For example, to install the dependency packages on Ubuntu:: | |
240 | ||
241 | $ sudo apt-get install build-essential python-dev python-pip liberasurecode-dev | |
242 | $ sudo pip install -U bindep -r test-requirement.txt | |
243 | ||
244 | If you want to confirm all dependency packages installed successfully, try:: | |
245 | ||
246 | $ sudo bindep -f bindep.txt | |
247 | ||
248 | *Note*: currently liberasurecode-dev/liberasurecode-devel in package repo is older than v1.2.0 | |
249 | ||
250 | Install PyECLib:: | |
251 | ||
252 | $ sudo python setup.py install | |
253 | ||
254 | Run test suite included:: | |
255 | ||
256 | $ ./.unittests | |
257 | ||
258 | If all of this works, then you should be good to go. If not, send us an email! | |
259 | ||
260 | If the test suite fails because it cannot find any of the shared libraries, | |
261 | then you probably need to add /usr/local/lib to the path searched when loading | |
262 | libraries. The best way to do this (on Linux) is to add '/usr/local/lib' to:: | |
263 | ||
264 | /etc/ld.so.conf | |
265 | ||
266 | and then make sure to run:: | |
267 | ||
268 | $ sudo ldconfig | |
269 | ||
270 | ||
271 | References | |
272 | ||
273 | [1] Jerasure, C library that supports erasure coding in storage applications, http://jerasure.org | |
274 | ||
275 | [2] Greenan, Kevin M et al, "Flat XOR-based erasure codes in storage systems", http://www.kaymgee.com/Kevin_Greenan/Publications_files/greenan-msst10.pdf | |
276 | ||
277 | [3] liberasurecode, C API abstraction layer for erasure coding backends, https://github.com/openstack/liberasurecode | |
278 | ||
279 | [4] Intel(R) Storage Acceleration Library (Open Source Version), https://01.org/intel%C2%AE-storage-acceleration-library-open-source-version | |
280 | ||
281 | [5] Kota Tsuyuzaki <tsuyuzaki.kota@lab.ntt.co.jp>, "NTT SHSS Erasure Coding backend" |
47 | 47 | # built documents. |
48 | 48 | # |
49 | 49 | # The short X.Y version. |
50 | version = '1.3.1' | |
50 | version = '1.4.0' | |
51 | 51 | # The full version, including alpha/beta/rc tags. |
52 | release = '1.3.1' | |
52 | release = '1.4.0' | |
53 | 53 | |
54 | 54 | # The language for content autogenerated by Sphinx. Refer to documentation |
55 | 55 | # for a list of supported languages. |
158 | 158 | |
159 | 159 | module = Extension('pyeclib_c', |
160 | 160 | define_macros=[('MAJOR VERSION', '1'), |
161 | ('MINOR VERSION', '3')], | |
161 | ('MINOR VERSION', '4')], | |
162 | 162 | include_dirs=[default_python_incdir, |
163 | 163 | 'src/c/pyeclib_c', |
164 | 164 | '/usr/include', |
171 | 171 | sources=['src/c/pyeclib_c/pyeclib_c.c']) |
172 | 172 | |
173 | 173 | setup(name='pyeclib', |
174 | version='1.3.1', | |
174 | version='1.4.0', | |
175 | 175 | author='Kevin Greenan', |
176 | 176 | author_email='kmgreen2@gmail.com', |
177 | 177 | maintainer='Kevin Greenan and Tushar Gohad', |
178 | 178 | maintainer_email='kmgreen2@gmail.com, tusharsg@gmail.com', |
179 | 179 | url='http://git.openstack.org/cgit/openstack/pyeclib/', |
180 | bugtrack_url='https://bugs.launchpad.net/pyeclib', | |
180 | 181 | description='This library provides a simple Python interface for \ |
181 | 182 | implementing erasure codes. To obtain the best possible \ |
182 | 183 | performance, the underlying erasure code algorithms are \ |