New upstream version 2.2.0
Gianfranco Costamagna
0 | s3cmd-2.2.0 - 2021-09-27 | |
1 | =============== | |
2 | * Added support for metadata modification of files bigger than 5 GiB | |
3 | * Added support for remote copy of files bigger than 5 GiB using MultiPart copy (Damian Martinez, Florent Viard) | |
4 | * Added progress info output for multipart copy and current-total info in output for cp, mv and modify | |
5 | * Added support for all special/foreign character names in object names to cp/mv/modify | |
6 | * Added support for SSL authentication (Aleksandr Chazov) | |
7 | * Added the HTTP error 429 to the list of retryable errors (#1096) | |
8 | * Added support for listing and resuming of multipart uploads of more than 1000 parts (#346) | |
9 | * Added time based expiration for idle pool connections in order to avoid random broken pipe errors (#1114) | |
10 | * Added support for STS webidentity authentication (i.e. AssumeRole and AssumeRoleWithWebIdentity) (Samskeyti, Florent Viard) | |
11 | * Added support for custom headers to the mb command (#1197) (Sébastien Vajda) | |
12 | * Improved MultiPart copy to preserve acl and metadata of objects | |
13 | * Improved the server errors catching and reporting for cp/mv/modify commands | |
14 | * Improved resiliency against servers sending garbage responses (#1088, #1090, #1093) | |
15 | * Improved remote copy to have consistent copy of metadata in all cases: multipart or not, aws or not | |
16 | * Improved security by revoking public-write acl when private acl is set (#1151) (ruanzitao) | |
17 | * Improved speed when running on an EC2 instance (#1117) (Patrick Allain) | |
18 | * Reduced connection_max_age to 5s to avoid broken pipes as AWS closes https conns after around 6s (#1114) | |
19 | * Ensured that KeyboardInterrupt is always properly raised (#1089) | |
20 | * Changed size of multipart copy chunks to 1 GiB | |
21 | * Fixed ValueError when using more than one ":" inside add_header in config file (#1087) | |
22 | * Fixed extra label issue when stdin used as source of a MultiPart upload | |
23 | * Fixed remote copy to allow changing the mime-type (i.e. content-type) of the new object | |
24 | * Fixed remote_copy to ensure that meta-s3cmd-attrs will be set based on the real source and not on the copy source | |
25 | * Fixed deprecation warnings due to invalid escape sequences (Karthikeyan Singaravelan) | |
26 | * Fixed getbucketinfo that was broken when the bucket lifecycle uses the filter element (Liu Lan) | |
27 | * Fixed RestoreRequest XML namespace URL (#1203) (Akete) | |
28 | * Fixed PARTIAL exit code that was not properly set when needed for object_get (#1190) | |
29 | * Fixed a possible infinite loop when a file is truncated during hashsum or upload (#1125) (Matthew Krokosz, Florent Viard) | |
30 | * Fixed a wrong error in report_exception when the LANG env var was not set (#1113) | |
31 | * Fixed wrong wiki url in error messages (Alec Barrett) | |
32 | * Py3: Fixed an AttributeError when using the "files-from" option | |
33 | * Py3: Fixed compatibility issues due to the removal of getchildren() from ElementTree in python 3.9 (#1146, #1157, #1162, #1182, #1210) (Ondřej Budai) | |
34 | * Py3: Fixed compatibility issues due to the removal of encodestring() in python 3.9 (#1161, #1174) (Kentaro Kaneki) | |
35 | * Fixed a crash when the AWS_ACCESS_KEY env var is set but not AWS_SECRET_KEY (#1201) | |
36 | * Cleanup of check_md5 (Riccardo Magliocchetti) | |
37 | * Removed legacy code for dreamhost that should not be necessary anymore | |
38 | * Migrated CI tests to use github actions (Arnaud Jaffre) | |
39 | * Improved README with a link to INSTALL.md (Sia Karamalegos) | |
40 | * Improved help content (Dmitrii Korostelev, Roland Van Laar) | |
41 | * Improvements for setup and build configurations | |
42 | * Many other bug fixes | |
43 | ||
44 | ||
0 | 45 | s3cmd-2.1.0 - 2020-04-07 |
1 | 46 | =============== |
2 | 47 | * Changed size reporting using k instead of K as it is a multiple of 1024 (#956) |
0 | 0 | Metadata-Version: 1.2 |
1 | 1 | Name: s3cmd |
2 | Version: 2.1.0 | |
2 | Version: 2.2.0 | |
3 | 3 | Summary: Command line tool for managing Amazon S3 and CloudFront services |
4 | 4 | Home-page: http://s3tools.org |
5 | 5 | Author: Michal Ludvig |
19 | 19 | Authors: |
20 | 20 | -------- |
21 | 21 | Florent Viard <florent@sodria.com> |
22 | ||
22 | 23 | Michal Ludvig <michal@logix.cz> |
24 | ||
23 | 25 | Matt Domsch (github.com/mdomsch) |
24 | 26 | |
25 | 27 | Platform: UNKNOWN |
0 | 0 | ## S3cmd tool for Amazon Simple Storage Service (S3) |
1 | 1 | |
2 | [![Build Status](https://travis-ci.org/s3tools/s3cmd.svg?branch=master)](https://travis-ci.org/s3tools/s3cmd) | |
2 | [![Build Status](https://github.com/s3tools/s3cmd/actions/workflows/test.yml/badge.svg)](https://github.com/s3tools/s3cmd/actions/workflows/test.yml) | |
3 | 3 | |
4 | 4 | * Author: Michal Ludvig, michal@logix.cz |
5 | 5 | * [Project homepage](http://s3tools.org) |
14 | 14 | |
15 | 15 | S3cmd requires Python 2.6 or newer. |
16 | 16 | Python 3+ is also supported starting with S3cmd version 2. |
17 | ||
18 | See [installation instructions](https://github.com/s3tools/s3cmd/blob/master/INSTALL.md). | |
17 | 19 | |
18 | 20 | |
19 | 21 | ### What is S3cmd |
196 | 198 | |
197 | 199 | As you can see we didn't have to create the `/somewhere` 'directory'. In fact it's only a filename prefix, not a real directory and it doesn't have to be created in any way beforehand. |
198 | 200 | |
199 | In stead of using `put` with the `--recursive` option, you could also use the `sync` command: | |
201 | Instead of using `put` with the `--recursive` option, you could also use the `sync` command: | |
200 | 202 | |
201 | 203 | ``` |
202 | 204 | $ s3cmd sync dir1 dir2 s3://public.s3tools.org/somewhere/ |
8 | 8 | from __future__ import absolute_import, print_function |
9 | 9 | |
10 | 10 | import sys |
11 | from .Utils import getTreeFromXml, deunicodise, encode_to_s3, decode_from_s3 | |
11 | from .BaseUtils import getTreeFromXml, encode_to_s3, decode_from_s3 | |
12 | from .Utils import deunicodise | |
12 | 13 | |
13 | 14 | try: |
14 | 15 | import xml.etree.ElementTree as ET |
40 | 41 | |
41 | 42 | def isAnonRead(self): |
42 | 43 | return self.isAllUsers() and (self.permission == "READ" or self.permission == "FULL_CONTROL") |
44 | ||
45 | def isAnonWrite(self): | |
46 | return self.isAllUsers() and (self.permission == "WRITE" or self.permission == "FULL_CONTROL") | |
43 | 47 | |
44 | 48 | def getElement(self): |
45 | 49 | el = ET.Element("Grant") |
126 | 130 | return True |
127 | 131 | return False |
128 | 132 | |
133 | def isAnonWrite(self): | |
134 | for grantee in self.grantees: | |
135 | if grantee.isAnonWrite(): | |
136 | return True | |
137 | return False | |
138 | ||
129 | 139 | def grantAnonRead(self): |
130 | 140 | if not self.isAnonRead(): |
131 | 141 | self.appendGrantee(GranteeAnonRead()) |
132 | 142 | |
133 | 143 | def revokeAnonRead(self): |
134 | 144 | self.grantees = [g for g in self.grantees if not g.isAnonRead()] |
145 | ||
146 | def revokeAnonWrite(self): | |
147 | self.grantees = [g for g in self.grantees if not g.isAnonWrite()] | |
135 | 148 | |
136 | 149 | def appendGrantee(self, grantee): |
137 | 150 | self.grantees.append(grantee) |
147 | 160 | elif grantee.permission.upper() == permission: |
148 | 161 | return True |
149 | 162 | |
150 | return False; | |
163 | return False | |
151 | 164 | |
152 | 165 | def grant(self, name, permission): |
153 | 166 | if self.hasGrant(name, permission): |
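
The isAnonWrite()/revokeAnonWrite() pair added above is what lets s3cmd revoke public-write grants when a private ACL is applied (#1151 in the changelog). A minimal standalone sketch of that logic, with a hypothetical SimpleGrantee standing in for s3cmd's real Grantee class:

```
# Minimal sketch of the anonymous-write revocation added in 2.2.0.
# SimpleGrantee/SimpleACL are hypothetical stand-ins, not s3cmd classes.
class SimpleGrantee(object):
    ALL_USERS_URI = "http://acs.amazonaws.com/groups/global/AllUsers"

    def __init__(self, uri, permission):
        self.uri = uri
        self.permission = permission

    def isAllUsers(self):
        return self.uri == self.ALL_USERS_URI

    def isAnonWrite(self):
        # Mirrors Grantee.isAnonWrite() above
        return self.isAllUsers() and self.permission in ("WRITE", "FULL_CONTROL")


class SimpleACL(object):
    def __init__(self, grantees):
        self.grantees = grantees

    def revokeAnonWrite(self):
        # Mirrors ACL.revokeAnonWrite(): drop every public-write grant
        self.grantees = [g for g in self.grantees if not g.isAnonWrite()]


acl = SimpleACL([SimpleGrantee(SimpleGrantee.ALL_USERS_URI, "WRITE"),
                 SimpleGrantee("owner-id", "FULL_CONTROL")])
acl.revokeAnonWrite()
assert len(acl.grantees) == 1  # only the owner grant remains
```
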
11 | 11 | |
12 | 12 | from . import S3Uri |
13 | 13 | from .Exceptions import ParameterError |
14 | from .Utils import getTreeFromXml, decode_from_s3 | |
14 | from .BaseUtils import getTreeFromXml, decode_from_s3 | |
15 | 15 | from .ACL import GranteeAnonRead |
16 | 16 | |
17 | 17 | try: |
0 | # -*- coding: utf-8 -*- | |
1 | ||
2 | ## Amazon S3 manager | |
3 | ## Author: Michal Ludvig <michal@logix.cz> | |
4 | ## http://www.logix.cz/michal | |
5 | ## License: GPL Version 2 | |
6 | ## Copyright: TGRMN Software and contributors | |
7 | ||
8 | from __future__ import absolute_import, division | |
9 | ||
10 | import re | |
11 | import sys | |
12 | ||
13 | from calendar import timegm | |
14 | from logging import debug, warning, error | |
15 | ||
16 | import xml.dom.minidom | |
17 | import xml.etree.ElementTree as ET | |
18 | ||
19 | from .ExitCodes import EX_OSFILE | |
20 | ||
21 | try: | |
22 | import dateutil.parser | |
23 | except ImportError: | |
24 | sys.stderr.write(u""" | |
25 | !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! | |
26 | ImportError trying to import dateutil.parser. | |
27 | Please install the python dateutil module: | |
28 | $ sudo apt-get install python-dateutil | |
29 | or | |
30 | $ sudo yum install python-dateutil | |
31 | or | |
32 | $ pip install python-dateutil | |
33 | !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! | |
34 | """) | |
35 | sys.stderr.flush() | |
36 | sys.exit(EX_OSFILE) | |
37 | ||
38 | try: | |
39 | from urllib import quote | |
40 | except ImportError: | |
41 | # python 3 support | |
42 | from urllib.parse import quote | |
43 | ||
44 | try: | |
45 | unicode | |
46 | except NameError: | |
47 | # python 3 support | |
48 | # In python 3, unicode -> str, and str -> bytes | |
49 | unicode = str | |
50 | ||
51 | ||
52 | __all__ = [] | |
53 | ||
54 | ||
55 | RE_S3_DATESTRING = re.compile(r'\.[0-9]*(?:[Z\-\+]*?)') | |
56 | RE_XML_NAMESPACE = re.compile(br'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE) | |
57 | ||
58 | ||
59 | # Date and time helpers | |
60 | ||
61 | ||
62 | def dateS3toPython(date): | |
63 | # Reset milliseconds to 000 | |
64 | date = RE_S3_DATESTRING.sub(".000", date) | |
65 | return dateutil.parser.parse(date, fuzzy=True) | |
66 | __all__.append("dateS3toPython") | |
67 | ||
68 | ||
69 | def dateS3toUnix(date): | |
70 | ## NOTE: This is timezone-aware and returns the timestamp relative to GMT | |
71 | return timegm(dateS3toPython(date).utctimetuple()) | |
72 | __all__.append("dateS3toUnix") | |
73 | ||
74 | ||
75 | def dateRFC822toPython(date): | |
76 | """ | |
77 | Convert a string formatted like '2020-06-27T15:56:34Z' into a python datetime | |
78 | """ | |
79 | return dateutil.parser.parse(date, fuzzy=True) | |
80 | __all__.append("dateRFC822toPython") | |
81 | ||
82 | ||
83 | def dateRFC822toUnix(date): | |
84 | return timegm(dateRFC822toPython(date).utctimetuple()) | |
85 | __all__.append("dateRFC822toUnix") | |
86 | ||
87 | ||
88 | def formatDateTime(s3timestamp): | |
89 | date_obj = dateutil.parser.parse(s3timestamp, fuzzy=True) | |
90 | return date_obj.strftime("%Y-%m-%d %H:%M") | |
91 | __all__.append("formatDateTime") | |
92 | ||
93 | ||
94 | # Encoding / Decoding | |
95 | ||
96 | ||
97 | def base_unicodise(string, encoding='UTF-8', errors='replace', silent=False): | |
98 | """ | |
99 | Convert 'string' to Unicode or raise an exception. | |
100 | """ | |
101 | if type(string) == unicode: | |
102 | return string | |
103 | ||
104 | if not silent: | |
105 | debug("Unicodising %r using %s" % (string, encoding)) | |
106 | try: | |
107 | return unicode(string, encoding, errors) | |
108 | except UnicodeDecodeError: | |
109 | raise UnicodeDecodeError("Conversion to unicode failed: %r" % string) | |
110 | __all__.append("base_unicodise") | |
111 | ||
112 | ||
113 | def base_deunicodise(string, encoding='UTF-8', errors='replace', silent=False): | |
114 | """ | |
115 | Convert unicode 'string' to <type str>, by default replacing | |
116 | all invalid characters with '?' or raise an exception. | |
117 | """ | |
118 | if type(string) != unicode: | |
119 | return string | |
120 | ||
121 | if not silent: | |
122 | debug("DeUnicodising %r using %s" % (string, encoding)) | |
123 | try: | |
124 | return string.encode(encoding, errors) | |
125 | except UnicodeEncodeError: | |
126 | raise UnicodeEncodeError("Conversion from unicode failed: %r" % string) | |
127 | __all__.append("base_deunicodise") | |
128 | ||
129 | ||
130 | def decode_from_s3(string, errors = "replace"): | |
131 | """ | |
132 | Convert S3 UTF-8 'string' to Unicode or raise an exception. | |
133 | """ | |
134 | return base_unicodise(string, "UTF-8", errors, True) | |
135 | __all__.append("decode_from_s3") | |
136 | ||
137 | ||
138 | def encode_to_s3(string, errors='replace'): | |
139 | """ | |
140 | Convert Unicode to S3 UTF-8 'string', by default replacing | |
141 | all invalid characters with '?' or raise an exception. | |
142 | """ | |
143 | return base_deunicodise(string, "UTF-8", errors, True) | |
144 | __all__.append("encode_to_s3") | |
145 | ||
146 | ||
147 | def s3_quote(param, quote_backslashes=True, unicode_output=False): | |
148 | """ | |
149 | URI encode every byte. UriEncode() must enforce the following rules: | |
150 | - URI encode every byte except the unreserved characters: 'A'-'Z', 'a'-'z', '0'-'9', '-', '.', '_', and '~'. | |
151 | - The space character is a reserved character and must be encoded as "%20" (and not as "+"). | |
152 | - Each URI encoded byte is formed by a '%' and the two-digit hexadecimal value of the byte. | |
153 | - Letters in the hexadecimal value must be uppercase, for example "%1A". | |
154 | - Encode the forward slash character, '/', everywhere except in the object key name. | |
155 | For example, if the object key name is photos/Jan/sample.jpg, the forward slash in the key name is not encoded. | |
156 | """ | |
157 | if quote_backslashes: | |
158 | safe_chars = "~" | |
159 | else: | |
160 | safe_chars = "~/" | |
161 | param = encode_to_s3(param) | |
162 | param = quote(param, safe=safe_chars) | |
163 | if unicode_output: | |
164 | param = decode_from_s3(param) | |
165 | else: | |
166 | param = encode_to_s3(param) | |
167 | return param | |
168 | __all__.append("s3_quote") | |
169 | ||
170 | ||
171 | def base_urlencode_string(string, urlencoding_mode = None, unicode_output=False): | |
172 | string = encode_to_s3(string) | |
173 | ||
174 | if urlencoding_mode == "verbatim": | |
175 | ## Don't do any pre-processing | |
176 | return string | |
177 | ||
178 | encoded = quote(string, safe="~/") | |
179 | debug("String '%s' encoded to '%s'" % (string, encoded)) | |
180 | if unicode_output: | |
181 | return decode_from_s3(encoded) | |
182 | else: | |
183 | return encode_to_s3(encoded) | |
184 | __all__.append("base_urlencode_string") | |
185 | ||
186 | ||
187 | def base_replace_nonprintables(string, with_message=False): | |
188 | """ | |
189 | replace_nonprintables(string) | |
190 | ||
191 | Replaces all non-printable characters 'ch' in 'string' | |
192 | where ord(ch) <= 26 with ^@, ^A, ... ^Z | |
193 | """ | |
194 | new_string = "" | |
195 | modified = 0 | |
196 | for c in string: | |
197 | o = ord(c) | |
198 | if (o <= 31): | |
199 | new_string += "^" + chr(ord('@') + o) | |
200 | modified += 1 | |
201 | elif (o == 127): | |
202 | new_string += "^?" | |
203 | modified += 1 | |
204 | else: | |
205 | new_string += c | |
206 | if modified and with_message: | |
207 | warning("%d non-printable characters replaced in: %s" % (modified, new_string)) | |
208 | return new_string | |
209 | __all__.append("base_replace_nonprintables") | |
210 | ||
211 | ||
212 | # XML helpers | |
213 | ||
214 | ||
215 | def parseNodes(nodes): | |
216 | ## WARNING: Ignores text nodes from mixed xml/text. | |
217 | ## For instance <tag1>some text<tag2>other text</tag2></tag1> | |
218 | ## will ignore the "some text" node | |
219 | ## WARNING 2: Any node at first level without children will also be ignored | |
220 | retval = [] | |
221 | for node in nodes: | |
222 | retval_item = {} | |
223 | for child in node: | |
224 | name = decode_from_s3(child.tag) | |
225 | if len(child): | |
226 | retval_item[name] = parseNodes([child]) | |
227 | else: | |
228 | found_text = node.findtext(".//%s" % child.tag) | |
229 | if found_text is not None: | |
230 | retval_item[name] = decode_from_s3(found_text) | |
231 | else: | |
232 | retval_item[name] = None | |
233 | if retval_item: | |
234 | retval.append(retval_item) | |
235 | return retval | |
236 | __all__.append("parseNodes") | |
237 | ||
238 | ||
239 | def getPrettyFromXml(xmlstr): | |
240 | xmlparser = xml.dom.minidom.parseString(xmlstr) | |
241 | return xmlparser.toprettyxml() | |
242 | ||
243 | __all__.append("getPrettyFromXml") | |
244 | ||
245 | ||
246 | def stripNameSpace(xml): | |
247 | """ | |
248 | stripNameSpace(xml) -- remove top-level AWS namespace | |
249 | Operates on a raw byte (utf-8) xml string, not unicode. | |
250 | """ | |
251 | xmlns_match = RE_XML_NAMESPACE.match(xml) | |
252 | if xmlns_match: | |
253 | xmlns = xmlns_match.group(3) | |
254 | xml = RE_XML_NAMESPACE.sub("\\1\\2", xml, 1) | |
255 | else: | |
256 | xmlns = None | |
257 | return xml, xmlns | |
258 | __all__.append("stripNameSpace") | |
259 | ||
260 | ||
261 | def getTreeFromXml(xml): | |
262 | xml, xmlns = stripNameSpace(encode_to_s3(xml)) | |
263 | try: | |
264 | tree = ET.fromstring(xml) | |
265 | if xmlns: | |
266 | tree.attrib['xmlns'] = xmlns | |
267 | return tree | |
268 | except Exception as e: | |
269 | error("Error parsing xml: %s", e) | |
270 | error(xml) | |
271 | raise | |
272 | __all__.append("getTreeFromXml") | |
273 | ||
274 | ||
275 | def getListFromXml(xml, node): | |
276 | tree = getTreeFromXml(xml) | |
277 | nodes = tree.findall('.//%s' % (node)) | |
278 | return parseNodes(nodes) | |
279 | __all__.append("getListFromXml") | |
280 | ||
281 | ||
282 | def getDictFromTree(tree): | |
283 | ret_dict = {} | |
284 | for child in tree: | |
285 | if len(child): | |
286 | ## Complex-type child. Recurse | |
287 | content = getDictFromTree(child) | |
288 | else: | |
289 | content = decode_from_s3(child.text) if child.text is not None else None | |
290 | child_tag = decode_from_s3(child.tag) | |
291 | if child_tag in ret_dict: | |
292 | if not type(ret_dict[child_tag]) == list: | |
293 | ret_dict[child_tag] = [ret_dict[child_tag]] | |
294 | ret_dict[child_tag].append(content or "") | |
295 | else: | |
296 | ret_dict[child_tag] = content or "" | |
297 | return ret_dict | |
298 | __all__.append("getDictFromTree") | |
299 | ||
300 | ||
301 | def getTextFromXml(xml, xpath): | |
302 | tree = getTreeFromXml(xml) | |
303 | if tree.tag.endswith(xpath): | |
304 | return decode_from_s3(tree.text) if tree.text is not None else None | |
305 | else: | |
306 | result = tree.findtext(xpath) | |
307 | return decode_from_s3(result) if result is not None else None | |
308 | __all__.append("getTextFromXml") | |
309 | ||
310 | ||
311 | def getRootTagName(xml): | |
312 | tree = getTreeFromXml(xml) | |
313 | return decode_from_s3(tree.tag) if tree.tag is not None else None | |
314 | __all__.append("getRootTagName") | |
315 | ||
316 | ||
317 | def xmlTextNode(tag_name, text): | |
318 | el = ET.Element(tag_name) | |
319 | el.text = decode_from_s3(text) | |
320 | return el | |
321 | __all__.append("xmlTextNode") | |
322 | ||
323 | ||
324 | def appendXmlTextNode(tag_name, text, parent): | |
325 | """ | |
326 | Creates a new <tag_name> Node and sets | |
327 | its content to 'text'. Then appends the | |
328 | created Node to 'parent' element if given. | |
329 | Returns the newly created Node. | |
330 | """ | |
331 | el = xmlTextNode(tag_name, text) | |
332 | parent.append(el) | |
333 | return el | |
334 | __all__.append("appendXmlTextNode") | |
335 | ||
336 | ||
337 | # vim:et:ts=4:sts=4:ai |
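
BaseUtils.py is new in 2.2.0: the self-contained date, encoding and XML helpers were split out of Utils.py so that low-level modules (e.g. Config) can import them without circular imports. A short usage sketch of the helpers defined above, assuming the S3 package is importable:

```
# Usage sketch for the BaseUtils helpers above; assumes the S3 package
# is on sys.path.
from S3.BaseUtils import dateS3toUnix, s3_quote, getListFromXml

# Dates: S3 timestamps are parsed timezone-aware, relative to GMT
print(dateS3toUnix("2021-09-27T12:00:00.000Z"))

# Quoting: every byte except the unreserved characters is percent-encoded;
# '/' is kept literal only with quote_backslashes=False (object key names)
print(s3_quote("photos/Jan/sample 1.jpg", quote_backslashes=False,
               unicode_output=True))    # photos/Jan/sample%201.jpg

# XML: the top-level namespace is stripped, then nodes become dicts
xml = (b'<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">'
       b'<Contents><Key>a.txt</Key><Size>12</Size></Contents>'
       b'</ListBucketResult>')
print(getListFromXml(xml, "Contents"))  # [{'Key': 'a.txt', 'Size': '12'}]
```
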
21 | 21 | from .S3 import S3 |
22 | 22 | from .Config import Config |
23 | 23 | from .Exceptions import * |
24 | from .Utils import (getTreeFromXml, appendXmlTextNode, getDictFromTree, | |
25 | dateS3toPython, getBucketFromHostname, | |
26 | getHostnameFromBucket, deunicodise, urlencode_string, | |
27 | convertHeaderTupleListToDict, encode_to_s3, decode_from_s3) | |
24 | from .BaseUtils import (getTreeFromXml, appendXmlTextNode, getDictFromTree, | |
25 | dateS3toPython, encode_to_s3, decode_from_s3) | |
26 | from .Utils import (getBucketFromHostname, getHostnameFromBucket, deunicodise, | |
27 | urlencode_string, convertHeaderTupleListToDict) | |
28 | 28 | from .Crypto import sign_string_v2 |
29 | 29 | from .S3Uri import S3Uri, S3UriS3 |
30 | 30 | from .ConnMan import ConnMan |
482 | 482 | fp.write(deunicodise("\n".join(paths)+"\n")) |
483 | 483 | warning("Request to invalidate %d paths (max 999 supported)" % len(paths)) |
484 | 484 | warning("All the paths are now saved in: %s" % tmp_filename) |
485 | except: | |
485 | except Exception: | |
486 | 486 | pass |
487 | 487 | raise ParameterError("Too many paths to invalidate") |
488 | 488 | |
621 | 621 | # do this since S3 buckets that are set up as websites use custom origins. |
622 | 622 | # Thankfully, the custom origin URLs they use start with the URL of the |
623 | 623 | # S3 bucket. Here, we make use of this naming convention to support this use case.
624 | distListIndex = getBucketFromHostname(d.info['CustomOrigin']['DNSName'])[0]; | |
624 | distListIndex = getBucketFromHostname(d.info['CustomOrigin']['DNSName'])[0] | |
625 | 625 | distListIndex = distListIndex[:len(uri.bucket())] |
626 | 626 | else: |
627 | 627 | # Aral: I'm not sure when this condition will be reached, but keeping it in there. |
802 | 802 | try: |
803 | 803 | for i in inval_list['inval_list'].info['InvalidationSummary']: |
804 | 804 | requests.append("/".join(["cf:/", cfuri.dist_id(), i["Id"]])) |
805 | except: | |
805 | except Exception: | |
806 | 806 | continue |
807 | 807 | for req in requests: |
808 | 808 | cfuri = S3Uri(req) |
8 | 8 | from __future__ import absolute_import |
9 | 9 | |
10 | 10 | import logging |
11 | from logging import debug, warning, error | |
11 | import datetime | |
12 | import locale | |
12 | 13 | import re |
13 | 14 | import os |
14 | 15 | import io |
15 | 16 | import sys |
16 | 17 | import json |
17 | from . import Progress | |
18 | from .SortedDict import SortedDict | |
18 | import time | |
19 | ||
20 | from logging import debug, warning | |
21 | ||
22 | from .ExitCodes import EX_OSFILE | |
23 | ||
24 | try: | |
25 | import dateutil.parser | |
26 | import dateutil.tz | |
27 | except ImportError: | |
28 | sys.stderr.write(u""" | |
29 | !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! | |
30 | ImportError trying to import dateutil.parser and dateutil.tz. | |
31 | Please install the python dateutil module: | |
32 | $ sudo apt-get install python-dateutil | |
33 | or | |
34 | $ sudo yum install python-dateutil | |
35 | or | |
36 | $ pip install python-dateutil | |
37 | !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! | |
38 | """) | |
39 | sys.stderr.flush() | |
40 | sys.exit(EX_OSFILE) | |
41 | ||
19 | 42 | try: |
20 | 43 | # python 3 support |
21 | 44 | import httplib |
22 | 45 | except ImportError: |
23 | 46 | import http.client as httplib |
24 | import locale | |
25 | 47 | |
26 | 48 | try: |
27 | from configparser import (NoOptionError, NoSectionError, | |
28 | MissingSectionHeaderError, ParsingError, | |
29 | ConfigParser as PyConfigParser) | |
49 | from configparser import (NoOptionError, NoSectionError, | |
50 | MissingSectionHeaderError, ParsingError, | |
51 | ConfigParser as PyConfigParser) | |
30 | 52 | except ImportError: |
31 | # Python2 fallback code | |
32 | from ConfigParser import (NoOptionError, NoSectionError, | |
33 | MissingSectionHeaderError, ParsingError, | |
34 | ConfigParser as PyConfigParser) | |
53 | # Python2 fallback code | |
54 | from ConfigParser import (NoOptionError, NoSectionError, | |
55 | MissingSectionHeaderError, ParsingError, | |
56 | ConfigParser as PyConfigParser) | |
57 | ||
58 | from . import Progress | |
59 | from .SortedDict import SortedDict | |
60 | from .BaseUtils import (s3_quote, getTreeFromXml, getDictFromTree, | |
61 | base_unicodise, dateRFC822toPython) | |
62 | ||
35 | 63 | |
36 | 64 | try: |
37 | 65 | unicode |
40 | 68 | # In python 3, unicode -> str, and str -> bytes |
41 | 69 | unicode = str |
42 | 70 | |
43 | def config_unicodise(string, encoding = "utf-8", errors = "replace"): | |
44 | """ | |
45 | Convert 'string' to Unicode or raise an exception. | |
46 | Config can't use toolbox from Utils that is itself using Config | |
47 | """ | |
48 | if type(string) == unicode: | |
49 | return string | |
50 | ||
51 | try: | |
52 | return unicode(string, encoding, errors) | |
53 | except UnicodeDecodeError: | |
54 | raise UnicodeDecodeError("Conversion to unicode failed: %r" % string) | |
55 | 71 | |
56 | 72 | def is_bool_true(value): |
57 | 73 | """Check to see if a string is true, yes, on, or 1 |
67 | 83 | else: |
68 | 84 | return False |
69 | 85 | |
86 | ||
70 | 87 | def is_bool_false(value): |
71 | 88 | """Check to see if a string is false, no, off, or 0 |
72 | 89 | |
81 | 98 | else: |
82 | 99 | return False |
83 | 100 | |
101 | ||
84 | 102 | def is_bool(value): |
85 | 103 | """Check a string value to see if it is bool""" |
86 | 104 | return is_bool_true(value) or is_bool_false(value) |
105 | ||
87 | 106 | |
88 | 107 | class Config(object): |
89 | 108 | _instance = None |
93 | 112 | secret_key = u"" |
94 | 113 | access_token = u"" |
95 | 114 | _access_token_refresh = True |
115 | _access_token_expiration = None | |
116 | _access_token_last_update = None | |
96 | 117 | host_base = u"s3.amazonaws.com" |
97 | 118 | host_bucket = u"%(bucket)s.s3.amazonaws.com" |
98 | 119 | kms_key = u"" #can't set this and Server Side Encryption at the same time |
152 | 173 | gpg_decrypt = u"%(gpg_command)s -d --verbose --no-use-agent --batch --yes --passphrase-fd %(passphrase_fd)s -o %(output_file)s %(input_file)s" |
153 | 174 | use_https = True |
154 | 175 | ca_certs_file = u"" |
176 | ssl_client_key_file = u"" | |
177 | ssl_client_cert_file = u"" | |
155 | 178 | check_ssl_certificate = True |
156 | 179 | check_ssl_hostname = True |
157 | 180 | bucket_location = u"US" |
160 | 183 | use_mime_magic = True |
161 | 184 | mime_type = u"" |
162 | 185 | enable_multipart = True |
163 | multipart_chunk_size_mb = 15 # MB | |
164 | multipart_max_chunks = 10000 # Maximum chunks on AWS S3, could be different on other S3-compatible APIs | |
186 | # The chunk size is both the upload chunk size and the multipart threshold | |
187 | multipart_chunk_size_mb = 15 # MiB | |
188 | # Maximum chunk size for s3-to-s3 copy is 5 GiB. | |
189 | # But use a much lower value by default (1 GiB) | |
190 | multipart_copy_chunk_size_mb = 1 * 1024 | |
191 | # Maximum chunks on AWS S3, could be different on other S3-compatible APIs | |
192 | multipart_max_chunks = 10000 | |
165 | 193 | # List of checks to be performed for 'sync' |
166 | 194 | sync_checks = ['size', 'md5'] # 'weak-timestamp' |
167 | 195 | # List of compiled REGEXPs |
210 | 238 | throttle_max = 100 |
211 | 239 | public_url_use_https = False |
212 | 240 | connection_pooling = True |
241 | # How long in seconds a connection can be kept idle in the pool and still | |
242 | # be alive. AWS s3 is supposed to close connections that are idle for 20 | |
243 | # seconds or more, but in practice (undocumented) it closes https conns | |
244 | # after around 6s of inactivity. | |
245 | connection_max_age = 5 | |
213 | 246 | |
214 | 247 | ## Creating a singleton |
215 | 248 | def __new__(self, configfile = None, access_key=None, secret_key=None, access_token=None): |
234 | 267 | # Do not refresh the IAM role when an access token is provided. |
235 | 268 | self._access_token_refresh = False |
236 | 269 | |
237 | if len(self.access_key)==0: | |
270 | if len(self.access_key) == 0: | |
238 | 271 | env_access_key = os.getenv('AWS_ACCESS_KEY') or os.getenv('AWS_ACCESS_KEY_ID') |
239 | 272 | env_secret_key = os.getenv('AWS_SECRET_KEY') or os.getenv('AWS_SECRET_ACCESS_KEY') |
240 | 273 | env_access_token = os.getenv('AWS_SESSION_TOKEN') or os.getenv('AWS_SECURITY_TOKEN') |
241 | 274 | if env_access_key: |
275 | if not env_secret_key: | |
276 | raise ValueError( | |
277 | "AWS_ACCESS_KEY environment variable is used but" | |
278 | " AWS_SECRET_KEY variable is missing" | |
279 | ) | |
242 | 280 | # py3 getenv returns unicode and py2 returns bytes. |
243 | self.access_key = config_unicodise(env_access_key) | |
244 | self.secret_key = config_unicodise(env_secret_key) | |
281 | self.access_key = base_unicodise(env_access_key) | |
282 | self.secret_key = base_unicodise(env_secret_key) | |
245 | 283 | if env_access_token: |
246 | 284 | # Do not refresh the IAM role when an access token is provided. |
247 | 285 | self._access_token_refresh = False |
248 | self.access_token = config_unicodise(env_access_token) | |
286 | self.access_token = base_unicodise(env_access_token) | |
249 | 287 | else: |
250 | 288 | self.role_config() |
251 | 289 | |
257 | 295 | |
258 | 296 | def role_config(self): |
259 | 297 | """ |
260 | Get credentials from IAM authentication | |
298 | Get credentials from IAM authentication and STS AssumeRole | |
261 | 299 | """ |
262 | 300 | try: |
263 | conn = httplib.HTTPConnection(host='169.254.169.254', timeout = 2) | |
264 | conn.request('GET', "/latest/meta-data/iam/security-credentials/") | |
265 | resp = conn.getresponse() | |
266 | files = resp.read() | |
267 | if resp.status == 200 and len(files)>1: | |
268 | conn.request('GET', "/latest/meta-data/iam/security-credentials/%s" % files.decode('utf-8')) | |
269 | resp=conn.getresponse() | |
270 | if resp.status == 200: | |
271 | resp_content = config_unicodise(resp.read()) | |
272 | creds=json.loads(resp_content) | |
273 | Config().update_option('access_key', config_unicodise(creds['AccessKeyId'])) | |
274 | Config().update_option('secret_key', config_unicodise(creds['SecretAccessKey'])) | |
275 | Config().update_option('access_token', config_unicodise(creds['Token'])) | |
301 | role_arn = os.environ.get('AWS_ROLE_ARN') | |
302 | if role_arn: | |
303 | role_session_name = 'role-session-%s' % (int(time.time())) | |
304 | params = { | |
305 | 'Action': 'AssumeRole', | |
306 | 'Version': '2011-06-15', | |
307 | 'RoleArn': role_arn, | |
308 | 'RoleSessionName': role_session_name, | |
309 | } | |
310 | web_identity_token_file = os.environ.get('AWS_WEB_IDENTITY_TOKEN_FILE') | |
311 | if web_identity_token_file: | |
312 | with open(web_identity_token_file) as f: | |
313 | web_identity_token = f.read() | |
314 | params['Action'] = 'AssumeRoleWithWebIdentity' | |
315 | params['WebIdentityToken'] = web_identity_token | |
316 | encoded_params = '&'.join([ | |
317 | '%s=%s' % (k, s3_quote(v, unicode_output=True)) | |
318 | for k, v in params.items() | |
319 | ]) | |
320 | conn = httplib.HTTPSConnection(host='sts.amazonaws.com', | |
321 | timeout=2) | |
322 | conn.request('POST', '/?' + encoded_params) | |
323 | resp = conn.getresponse() | |
324 | resp_content = resp.read() | |
325 | if resp.status == 200 and len(resp_content) > 1: | |
326 | tree = getTreeFromXml(resp_content) | |
327 | result_dict = getDictFromTree(tree) | |
328 | if tree.tag == "AssumeRoleResponse": | |
329 | creds = result_dict['AssumeRoleResult']['Credentials'] | |
330 | elif tree.tag == "AssumeRoleWithWebIdentityResponse": | |
331 | creds = result_dict['AssumeRoleWithWebIdentityResult']['Credentials'] | |
332 | else: | |
333 | raise IOError("Unexpected XML message from STS server: <%s />" % tree.tag) | |
334 | Config().update_option('access_key', creds['AccessKeyId']) | |
335 | Config().update_option('secret_key', creds['SecretAccessKey']) | |
336 | Config().update_option('access_token', creds['SessionToken']) | |
337 | expiration = dateRFC822toPython(base_unicodise(creds['Expiration'])) | |
338 | # Subtract a safety margin so the token is refreshed early even if the EC2 clock is skewed | |
339 | self._access_token_expiration = expiration - datetime.timedelta(minutes=15) | |
340 | # last update date is not provided in STS responses | |
341 | self._access_token_last_update = datetime.datetime.now(dateutil.tz.tzutc()) | |
342 | # Other variables: Code / Type | |
276 | 343 | else: |
277 | 344 | raise IOError |
278 | 345 | else: |
279 | raise IOError | |
346 | conn = httplib.HTTPConnection(host='169.254.169.254', | |
347 | timeout=2) | |
348 | conn.request('GET', "/latest/meta-data/iam/security-credentials/") | |
349 | resp = conn.getresponse() | |
350 | files = resp.read() | |
351 | if resp.status == 200 and len(files) > 1: | |
352 | conn.request('GET', "/latest/meta-data/iam/security-credentials/%s" % files.decode('utf-8')) | |
353 | resp=conn.getresponse() | |
354 | if resp.status == 200: | |
355 | resp_content = base_unicodise(resp.read()) | |
356 | creds=json.loads(resp_content) | |
357 | Config().update_option('access_key', base_unicodise(creds['AccessKeyId'])) | |
358 | Config().update_option('secret_key', base_unicodise(creds['SecretAccessKey'])) | |
359 | Config().update_option('access_token', base_unicodise(creds['Token'])) | |
360 | expiration = dateRFC822toPython(base_unicodise(creds['Expiration'])) | |
361 | # Subtract a safety margin so the token is refreshed early even if the EC2 clock is skewed | |
362 | self._access_token_expiration = expiration - datetime.timedelta(minutes=15) | |
363 | self._access_token_last_update = dateRFC822toPython(base_unicodise(creds['LastUpdated'])) | |
364 | # Other variables: Code / Type | |
365 | else: | |
366 | raise IOError | |
367 | else: | |
368 | raise IOError | |
280 | 369 | except: |
281 | 370 | raise |
282 | 371 | |
283 | 372 | def role_refresh(self): |
284 | 373 | if self._access_token_refresh: |
374 | now = datetime.datetime.now(dateutil.tz.tzutc()) | |
375 | if self._access_token_expiration \ | |
376 | and now < self._access_token_expiration \ | |
377 | and self._access_token_last_update \ | |
378 | and self._access_token_last_update <= now: | |
379 | # current token is still valid. No need to refresh it | |
380 | return | |
285 | 381 | try: |
286 | 382 | self.role_config() |
287 | except: | |
383 | except Exception: | |
288 | 384 | warning("Could not refresh role") |
289 | 385 | |
290 | 386 | def aws_credential_file(self): |
293 | 389 | credential_file_from_env = os.environ.get('AWS_CREDENTIAL_FILE') |
294 | 390 | if credential_file_from_env and \ |
295 | 391 | os.path.isfile(credential_file_from_env): |
296 | aws_credential_file = config_unicodise(credential_file_from_env) | |
392 | aws_credential_file = base_unicodise(credential_file_from_env) | |
297 | 393 | elif not os.path.isfile(aws_credential_file): |
298 | 394 | return |
299 | ||
300 | warning("Errno %d accessing credentials file %s" % (e.errno, aws_credential_file)) | |
301 | 395 | |
302 | 396 | config = PyConfigParser() |
303 | 397 | |
311 | 405 | # but so far readfp it is still available. |
312 | 406 | config.readfp(io.StringIO(config_string)) |
313 | 407 | except MissingSectionHeaderError: |
314 | # if header is missing, this could be deprecated credentials file format | |
315 | # as described here: https://blog.csanchez.org/2011/05/ | |
408 | # if header is missing, this could be deprecated | |
409 | # credentials file format as described here: | |
410 | # https://blog.csanchez.org/2011/05/ | |
316 | 411 | # then do the hacky-hack and add default header |
317 | 412 | # to be able to read the file with PyConfigParser() |
318 | 413 | config_string = u'[default]\n' + config_string |
322 | 417 | "Error reading aws_credential_file " |
323 | 418 | "(%s): %s" % (aws_credential_file, str(exc))) |
324 | 419 | |
325 | profile = config_unicodise(os.environ.get('AWS_PROFILE', "default")) | |
420 | profile = base_unicodise(os.environ.get('AWS_PROFILE', "default")) | |
326 | 421 | debug("Using AWS profile '%s'" % (profile)) |
327 | 422 | |
328 | 423 | # get_key - helper function to read the aws profile credentials |
329 | # including the legacy ones as described here: https://blog.csanchez.org/2011/05/ | |
424 | # including the legacy ones as described here: | |
425 | # https://blog.csanchez.org/2011/05/ | |
330 | 426 | def get_key(profile, key, legacy_key, print_warning=True): |
331 | 427 | result = None |
332 | 428 | |
333 | 429 | try: |
334 | 430 | result = config.get(profile, key) |
335 | 431 | except NoOptionError as e: |
336 | if print_warning: # we may want to skip warning message for optional keys | |
337 | warning("Couldn't find key '%s' for the AWS Profile '%s' in the credentials file '%s'" % (e.option, e.section, aws_credential_file)) | |
338 | if legacy_key: # if the legacy_key defined and original one wasn't found, try read the legacy_key | |
432 | # we may want to skip warning message for optional keys | |
433 | if print_warning: | |
434 | warning("Couldn't find key '%s' for the AWS Profile " | |
435 | "'%s' in the credentials file '%s'", | |
436 | e.option, e.section, aws_credential_file) | |
437 | # if the legacy_key defined and original one wasn't found, | |
438 | # try read the legacy_key | |
439 | if legacy_key: | |
339 | 440 | try: |
340 | 441 | key = legacy_key |
341 | 442 | profile = "default" |
342 | 443 | result = config.get(profile, key) |
343 | 444 | warning( |
344 | "Legacy configuratin key '%s' used, " % (key) + | |
345 | "please use the standardized config format as described here: " + | |
346 | "https://aws.amazon.com/blogs/security/a-new-and-standardized-way-to-manage-credentials-in-the-aws-sdks/" | |
347 | ) | |
445 | "Legacy configuratin key '%s' used, please use" | |
446 | " the standardized config format as described " | |
447 | "here: https://aws.amazon.com/blogs/security/a-new-and-standardized-way-to-manage-credentials-in-the-aws-sdks/", | |
448 | key) | |
348 | 449 | except NoOptionError as e: |
349 | 450 | pass |
350 | 451 | |
351 | 452 | if result: |
352 | debug("Found the configuration option '%s' for the AWS Profile '%s' in the credentials file %s" % (key, profile, aws_credential_file)) | |
453 | debug("Found the configuration option '%s' for the AWS " | |
454 | "Profile '%s' in the credentials file %s", | |
455 | key, profile, aws_credential_file) | |
353 | 456 | return result |
354 | 457 | |
355 | profile_access_key = get_key(profile, "aws_access_key_id", "AWSAccessKeyId") | |
458 | profile_access_key = get_key(profile, "aws_access_key_id", | |
459 | "AWSAccessKeyId") | |
356 | 460 | if profile_access_key: |
357 | Config().update_option('access_key', config_unicodise(profile_access_key)) | |
358 | ||
359 | profile_secret_key = get_key(profile, "aws_secret_access_key", "AWSSecretKey") | |
461 | Config().update_option('access_key', | |
462 | base_unicodise(profile_access_key)) | |
463 | ||
464 | profile_secret_key = get_key(profile, "aws_secret_access_key", | |
465 | "AWSSecretKey") | |
360 | 466 | if profile_secret_key: |
361 | Config().update_option('secret_key', config_unicodise(profile_secret_key)) | |
362 | ||
363 | profile_access_token = get_key(profile, "aws_session_token", None, False) | |
467 | Config().update_option('secret_key', | |
468 | base_unicodise(profile_secret_key)) | |
469 | ||
470 | profile_access_token = get_key(profile, "aws_session_token", None, | |
471 | False) | |
364 | 472 | if profile_access_token: |
365 | Config().update_option('access_token', config_unicodise(profile_access_token)) | |
473 | Config().update_option('access_token', | |
474 | base_unicodise(profile_access_token)) | |
366 | 475 | |
367 | 476 | except IOError as e: |
368 | warning("Errno %d accessing credentials file %s" % (e.errno, aws_credential_file)) | |
477 | warning("Errno %d accessing credentials file %s", e.errno, | |
478 | aws_credential_file) | |
369 | 479 | except NoSectionError as e: |
370 | warning("Couldn't find AWS Profile '%s' in the credentials file '%s'" % (profile, aws_credential_file)) | |
480 | warning("Couldn't find AWS Profile '%s' in the credentials file " | |
481 | "'%s'", profile, aws_credential_file) | |
371 | 482 | |
372 | 483 | def option_list(self): |
373 | 484 | retval = [] |
398 | 509 | |
399 | 510 | if cp.get('add_headers'): |
400 | 511 | for option in cp.get('add_headers').split(","): |
401 | (key, value) = option.split(':') | |
512 | (key, value) = option.split(':', 1) | |
402 | 513 | self.extra_headers[key.strip()] = value.strip() |
403 | 514 | |
404 | 515 | self._parsed_files.append(configfile) |
441 | 552 | shift = 0 |
442 | 553 | try: |
443 | 554 | value = shift and int(value[:-1]) << shift or int(value) |
444 | except: | |
555 | except Exception: | |
445 | 556 | raise ValueError("Config: value of option %s must have suffix m, k, or nothing, not '%s'" % (option, value)) |
446 | 557 | |
447 | 558 | ## allow yes/no, true/false, on/off and 1/0 for boolean options |
448 | 559 | ## Some options default to None, if that's the case check the value to see if it is bool |
449 | 560 | elif (type(getattr(Config, option)) is type(True) or # Config is bool |
450 | (getattr(Config, option) is None and is_bool(value))): # Config is None and value is bool | |
561 | (getattr(Config, option) is None and is_bool(value))): # Config is None and value is bool | |
451 | 562 | if is_bool_true(value): |
452 | 563 | value = True |
453 | 564 | elif is_bool_false(value): |
480 | 591 | if type(sections) != type([]): |
481 | 592 | sections = [sections] |
482 | 593 | in_our_section = True |
483 | r_comment = re.compile("^\s*#.*") | |
484 | r_empty = re.compile("^\s*$") | |
485 | r_section = re.compile("^\[([^\]]+)\]") | |
486 | r_data = re.compile("^\s*(?P<key>\w+)\s*=\s*(?P<value>.*)") | |
487 | r_quotes = re.compile("^\"(.*)\"\s*$") | |
594 | r_comment = re.compile(r'^\s*#.*') | |
595 | r_empty = re.compile(r'^\s*$') | |
596 | r_section = re.compile(r'^\[([^\]]+)\]') | |
597 | r_data = re.compile(r'^\s*(?P<key>\w+)\s*=\s*(?P<value>.*)') | |
598 | r_quotes = re.compile(r'^"(.*)"\s*$') | |
488 | 599 | with io.open(file, "r", encoding=self.get('encoding', 'UTF-8')) as fp: |
489 | 600 | for line in fp: |
490 | 601 | if r_comment.match(line) or r_empty.match(line): |
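
The STS branch of role_config() above reduces to a single unauthenticated POST to sts.amazonaws.com whose query string carries the role parameters. A standalone sketch of that request construction; the ARN and token values are placeholders, not real credentials:

```
# Sketch of the AssumeRoleWithWebIdentity request built in role_config();
# the RoleArn and WebIdentityToken below are placeholders.
import time
try:
    from urllib.parse import quote
except ImportError:          # Python 2
    from urllib import quote

params = {
    'Action': 'AssumeRoleWithWebIdentity',
    'Version': '2011-06-15',
    'RoleArn': 'arn:aws:iam::123456789012:role/example',   # placeholder
    'RoleSessionName': 'role-session-%d' % int(time.time()),
    'WebIdentityToken': 'EXAMPLE.TOKEN',                   # placeholder
}
# Same encoding rule as s3_quote(): only unreserved characters stay literal
encoded = '&'.join('%s=%s' % (k, quote(v, safe='~'))
                   for k, v in sorted(params.items()))
print('POST https://sts.amazonaws.com/?' + encoded)
```
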
13 | 13 | else: |
14 | 14 | from .Custom_httplib27 import httplib |
15 | 15 | import ssl |
16 | ||
17 | from logging import debug | |
16 | 18 | from threading import Semaphore |
17 | from logging import debug | |
19 | from time import time | |
18 | 20 | try: |
19 | 21 | # python 3 support |
20 | 22 | from urlparse import urlparse |
60 | 62 | return context |
61 | 63 | |
62 | 64 | @staticmethod |
65 | def _ssl_client_auth_context(certfile, keyfile, check_server_cert, cafile): | |
66 | context = None | |
67 | try: | |
68 | cert_reqs = ssl.CERT_REQUIRED if check_server_cert else ssl.CERT_NONE | |
69 | context = ssl._create_unverified_context(cafile=cafile, | |
70 | keyfile=keyfile, | |
71 | certfile=certfile, | |
72 | cert_reqs=cert_reqs) | |
73 | except AttributeError: # no ssl._create_unverified_context | |
74 | pass | |
75 | return context | |
76 | ||
77 | @staticmethod | |
63 | 78 | def _ssl_context(): |
64 | 79 | if http_connection.context_set: |
65 | 80 | return http_connection.context |
68 | 83 | cafile = cfg.ca_certs_file |
69 | 84 | if cafile == "": |
70 | 85 | cafile = None |
86 | certfile = cfg.ssl_client_cert_file or None | |
87 | keyfile = cfg.ssl_client_key_file or None # the key may be embedded into cert file | |
88 | ||
71 | 89 | debug(u"Using ca_certs_file %s", cafile) |
72 | ||
73 | if cfg.check_ssl_certificate: | |
90 | debug(u"Using ssl_client_cert_file %s", certfile) | |
91 | debug(u"Using ssl_client_key_file %s", keyfile) | |
92 | ||
93 | if certfile is not None: | |
94 | context = http_connection._ssl_client_auth_context(certfile, keyfile, cfg.check_ssl_certificate, cafile) | |
95 | elif cfg.check_ssl_certificate: | |
74 | 96 | context = http_connection._ssl_verified_context(cafile) |
75 | 97 | else: |
76 | 98 | context = http_connection._ssl_unverified_context(cafile) |
217 | 239 | debug(u'proxied HTTPConnection(%s, %s)', cfg.proxy_host, cfg.proxy_port) |
218 | 240 | # No tunnel here for the moment |
219 | 241 | |
242 | self.last_used_time = time() | |
220 | 243 | |
221 | 244 | class ConnMan(object): |
222 | 245 | _CS_REQ_SENT = httplib._CS_REQ_SENT |
223 | 246 | CONTINUE = httplib.CONTINUE |
224 | 247 | conn_pool_sem = Semaphore() |
225 | 248 | conn_pool = {} |
226 | conn_max_counter = 800 ## AWS closes connection after some ~90 requests | |
227 | ||
228 | @staticmethod | |
229 | def get(hostname, ssl = None): | |
249 | conn_max_counter = 800 ## AWS closes connection after some ~90 requests | |
250 | ||
251 | @staticmethod | |
252 | def get(hostname, ssl=None): | |
230 | 253 | cfg = Config() |
231 | if ssl == None: | |
254 | if ssl is None: | |
232 | 255 | ssl = cfg.use_https |
233 | 256 | conn = None |
234 | 257 | if cfg.proxy_host != "": |
240 | 263 | ConnMan.conn_pool_sem.acquire() |
241 | 264 | if conn_id not in ConnMan.conn_pool: |
242 | 265 | ConnMan.conn_pool[conn_id] = [] |
243 | if len(ConnMan.conn_pool[conn_id]): | |
266 | while ConnMan.conn_pool[conn_id]: | |
244 | 267 | conn = ConnMan.conn_pool[conn_id].pop() |
245 | debug("ConnMan.get(): re-using connection: %s#%d" % (conn.id, conn.counter)) | |
268 | cur_time = time() | |
269 | if cur_time < conn.last_used_time + cfg.connection_max_age \ | |
270 | and cur_time >= conn.last_used_time: | |
271 | debug("ConnMan.get(): re-using connection: %s#%d" | |
272 | % (conn.id, conn.counter)) | |
273 | break | |
274 | # Conn is too old or wall clock went back in the past | |
275 | debug("ConnMan.get(): closing expired connection") | |
276 | ConnMan.close(conn) | |
277 | conn = None | |
278 | ||
246 | 279 | ConnMan.conn_pool_sem.release() |
247 | 280 | if not conn: |
248 | 281 | debug("ConnMan.get(): creating new connection: %s" % conn_id) |
257 | 290 | def put(conn): |
258 | 291 | if conn.id.startswith("proxy://"): |
259 | 292 | ConnMan.close(conn) |
260 | debug("ConnMan.put(): closing proxy connection (keep-alive not yet supported)") | |
293 | debug("ConnMan.put(): closing proxy connection (keep-alive not yet" | |
294 | " supported)") | |
261 | 295 | return |
262 | 296 | |
263 | 297 | if conn.counter >= ConnMan.conn_max_counter: |
271 | 305 | debug("ConnMan.put(): closing connection (connection pooling disabled)") |
272 | 306 | return |
273 | 307 | |
308 | # Update timestamp of conn to record when was its last use | |
309 | conn.last_used_time = time() | |
310 | ||
274 | 311 | ConnMan.conn_pool_sem.acquire() |
275 | 312 | ConnMan.conn_pool[conn.id].append(conn) |
276 | 313 | ConnMan.conn_pool_sem.release() |
277 | debug("ConnMan.put(): connection put back to pool (%s#%d)" % (conn.id, conn.counter)) | |
314 | debug("ConnMan.put(): connection put back to pool (%s#%d)" | |
315 | % (conn.id, conn.counter)) | |
278 | 316 | |
279 | 317 | @staticmethod |
280 | 318 | def close(conn): |
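
The expiry test inside ConnMan.get() above is a plain wall-clock comparison: a pooled connection is reused only while its last_used_time is less than connection_max_age seconds in the past and not in the future. A minimal standalone sketch of that check:

```
# Standalone sketch of the idle-expiry test used by ConnMan.get();
# FakeConn is a stand-in exposing only the last_used_time attribute.
from time import time

CONNECTION_MAX_AGE = 5  # seconds; AWS closes idle https conns after ~6s

def is_still_fresh(conn, now=None):
    cur_time = now if now is not None else time()
    # Reject both stale connections and timestamps from the future
    # (the wall clock may have been set back).
    return (cur_time < conn.last_used_time + CONNECTION_MAX_AGE
            and cur_time >= conn.last_used_time)

class FakeConn(object):
    def __init__(self, last_used_time):
        self.last_used_time = last_used_time

now = time()
assert is_still_fresh(FakeConn(now - 2), now)        # 2s idle: reuse
assert not is_still_fresh(FakeConn(now - 10), now)   # 10s idle: expired
assert not is_still_fresh(FakeConn(now + 60), now)   # clock went backwards
```
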
9 | 9 | |
10 | 10 | import sys |
11 | 11 | import hmac |
12 | import base64 | |
12 | try: | |
13 | from base64 import encodebytes as encodestring | |
14 | except ImportError: | |
15 | # Python 2 support | |
16 | from base64 import encodestring | |
13 | 17 | |
14 | 18 | from . import Config |
15 | 19 | from logging import debug |
16 | from .Utils import encode_to_s3, time_to_epoch, deunicodise, decode_from_s3, check_bucket_name_dns_support | |
20 | from .BaseUtils import encode_to_s3, decode_from_s3, s3_quote | |
21 | from .Utils import time_to_epoch, deunicodise, check_bucket_name_dns_support | |
17 | 22 | from .SortedDict import SortedDict |
18 | 23 | |
19 | 24 | import datetime |
20 | try: | |
21 | # python 3 support | |
22 | from urllib import quote | |
23 | except ImportError: | |
24 | from urllib.parse import quote | |
25 | ||
25 | 26 | |
26 | 27 | from hashlib import sha1, sha256 |
27 | 28 | |
65 | 66 | and returned signature will be utf-8 encoded "bytes". |
66 | 67 | """ |
67 | 68 | secret_key = Config.Config().secret_key |
68 | signature = base64.encodestring(hmac.new(encode_to_s3(secret_key), string_to_sign, sha1).digest()).strip() | |
69 | signature = encodestring(hmac.new(encode_to_s3(secret_key), string_to_sign, sha1).digest()).strip() | |
69 | 70 | return signature |
70 | 71 | __all__.append("sign_string_v2") |
71 | 72 | |
249 | 250 | return new_headers |
250 | 251 | __all__.append("sign_request_v4") |
251 | 252 | |
252 | def s3_quote(param, quote_backslashes=True, unicode_output=False): | |
253 | """ | |
254 | URI encode every byte. UriEncode() must enforce the following rules: | |
255 | - URI encode every byte except the unreserved characters: 'A'-'Z', 'a'-'z', '0'-'9', '-', '.', '_', and '~'. | |
256 | - The space character is a reserved character and must be encoded as "%20" (and not as "+"). | |
257 | - Each URI encoded byte is formed by a '%' and the two-digit hexadecimal value of the byte. | |
258 | - Letters in the hexadecimal value must be uppercase, for example "%1A". | |
259 | - Encode the forward slash character, '/', everywhere except in the object key name. | |
260 | For example, if the object key name is photos/Jan/sample.jpg, the forward slash in the key name is not encoded. | |
261 | """ | |
262 | if quote_backslashes: | |
263 | safe_chars = "~" | |
264 | else: | |
265 | safe_chars = "~/" | |
266 | param = encode_to_s3(param) | |
267 | param = quote(param, safe=safe_chars) | |
268 | if unicode_output: | |
269 | param = decode_from_s3(param) | |
270 | else: | |
271 | param = encode_to_s3(param) | |
272 | return param | |
273 | __all__.append("s3_quote") | |
274 | ||
275 | 253 | def checksum_sha256_file(filename, offset=0, size=None): |
276 | 254 | try: |
277 | 255 | hash = sha256() |
278 | except: | |
256 | except Exception: | |
279 | 257 | # fallback to Crypto SHA256 module |
280 | 258 | hash = sha256.new() |
281 | 259 | with open(deunicodise(filename),'rb') as f: |
287 | 265 | size_left = size |
288 | 266 | while size_left > 0: |
289 | 267 | chunk = f.read(min(8192, size_left)) |
268 | if not chunk: | |
269 | break | |
290 | 270 | size_left -= len(chunk) |
291 | 271 | hash.update(chunk) |
292 | 272 | |
295 | 275 | def checksum_sha256_buffer(buffer, offset=0, size=None): |
296 | 276 | try: |
297 | 277 | hash = sha256() |
298 | except: | |
278 | except Exception: | |
299 | 279 | # fallback to Crypto SHA256 module |
300 | 280 | hash = sha256.new() |
301 | 281 | if size is None: |
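
The encodestring/encodebytes import shim at the top of Crypto.py is the Python 3.9 fix from the changelog (#1161, #1174): base64.encodestring() was removed, so the name is aliased per interpreter. A standalone sketch of the v2 HMAC-SHA1 signing built on that shim; the secret key and string-to-sign are placeholders:

```
# Sketch of the HMAC-SHA1 signing done by sign_string_v2(), using the
# same base64 compatibility shim; values below are placeholders.
import hmac
from hashlib import sha1
try:
    from base64 import encodebytes as encodestring
except ImportError:          # Python 2
    from base64 import encodestring

secret_key = b'placeholder-secret-key'
string_to_sign = b'GET\n\n\n1632744000\n/example-bucket/object.txt'
signature = encodestring(
    hmac.new(secret_key, string_to_sign, sha1).digest()).strip()
print(signature)  # base64-encoded bytes, e.g. b'...='
```
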
11 | 11 | except ImportError: |
12 | 12 | from StringIO import StringIO |
13 | 13 | |
14 | from .Utils import encode_to_s3 | |
14 | from .BaseUtils import encode_to_s3 | |
15 | 15 | |
16 | 16 | |
17 | 17 | _METHODS_EXPECTING_BODY = ['PATCH', 'POST', 'PUT'] |
10 | 10 | |
11 | 11 | from io import StringIO |
12 | 12 | |
13 | from .Utils import encode_to_s3 | |
13 | from .BaseUtils import encode_to_s3 | |
14 | 14 | |
15 | 15 | |
16 | 16 | _METHODS_EXPECTING_BODY = ['PATCH', 'POST', 'PUT'] |
9 | 9 | |
10 | 10 | from logging import debug, error |
11 | 11 | import sys |
12 | import S3.BaseUtils | |
12 | 13 | import S3.Utils |
13 | 14 | from . import ExitCodes |
14 | 15 | |
40 | 41 | ## s3cmd exceptions |
41 | 42 | |
42 | 43 | class S3Exception(Exception): |
43 | def __init__(self, message = ""): | |
44 | def __init__(self, message=""): | |
44 | 45 | self.message = S3.Utils.unicodise(message) |
45 | 46 | |
46 | 47 | def __str__(self): |
57 | 58 | ## (Base)Exception.message has been deprecated in Python 2.6 |
58 | 59 | def _get_message(self): |
59 | 60 | return self._message |
61 | ||
60 | 62 | def _set_message(self, message): |
61 | 63 | self._message = message |
62 | 64 | message = property(_get_message, _set_message) |
67 | 69 | self.status = response["status"] |
68 | 70 | self.reason = response["reason"] |
69 | 71 | self.info = { |
70 | "Code" : "", | |
71 | "Message" : "", | |
72 | "Resource" : "" | |
72 | "Code": "", | |
73 | "Message": "", | |
74 | "Resource": "" | |
73 | 75 | } |
74 | 76 | debug("S3Error: %s (%s)" % (self.status, self.reason)) |
75 | 77 | if "headers" in response: |
77 | 79 | debug("HttpHeader: %s: %s" % (header, response["headers"][header])) |
78 | 80 | if "data" in response and response["data"]: |
79 | 81 | try: |
80 | tree = S3.Utils.getTreeFromXml(response["data"]) | |
82 | tree = S3.BaseUtils.getTreeFromXml(response["data"]) | |
81 | 83 | except XmlParseError: |
82 | 84 | debug("Not an XML response") |
83 | 85 | else: |
113 | 115 | return ExitCodes.EX_PRECONDITION |
114 | 116 | elif self.status == 500: |
115 | 117 | return ExitCodes.EX_SOFTWARE |
116 | elif self.status == 503: | |
118 | elif self.status in [429, 503]: | |
117 | 119 | return ExitCodes.EX_SERVICE |
118 | 120 | else: |
119 | 121 | return ExitCodes.EX_SOFTWARE |
125 | 127 | if not error_node.tag == "Error": |
126 | 128 | error_node = tree.find(".//Error") |
127 | 129 | if error_node is not None: |
128 | for child in error_node.getchildren(): | |
130 | for child in error_node: | |
129 | 131 | if child.text != "": |
130 | 132 | debug("ErrorXML: " + child.tag + ": " + repr(child.text)) |
131 | 133 | info[child.tag] = child.text |
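
Adding 429 beside 503 in the status check above is what makes throttling retryable with a service-level exit code (#1096). A condensed sketch of that mapping; the EX_* values follow BSD sysexits conventions and are placeholders for the real constants in S3/ExitCodes.py:

```
# Condensed sketch of the S3Error status-to-exit-code branch; the EX_*
# values are illustrative placeholders, see S3/ExitCodes.py.
EX_SERVICE = 69    # service unavailable / throttled
EX_SOFTWARE = 70   # internal error

def exit_code_for_status(status):
    if status in (429, 503):   # 429 now retryable alongside 503
        return EX_SERVICE
    return EX_SOFTWARE

assert exit_code_for_status(429) == EX_SERVICE
assert exit_code_for_status(500) == EX_SOFTWARE
```
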
11 | 11 | from .Config import Config |
12 | 12 | from .S3Uri import S3Uri |
13 | 13 | from .FileDict import FileDict |
14 | from .Utils import * | |
14 | from .BaseUtils import dateS3toUnix, dateRFC822toUnix | |
15 | from .Utils import unicodise, deunicodise, deunicodise_s, replace_nonprintables | |
15 | 16 | from .Exceptions import ParameterError |
16 | 17 | from .HashCache import HashCache |
17 | 18 | |
34 | 35 | ''' |
35 | 36 | try: |
36 | 37 | names = os.listdir(deunicodise(top)) |
37 | except: | |
38 | except Exception: | |
38 | 39 | return |
39 | 40 | |
40 | 41 | dirs, nondirs = [], [] |
183 | 184 | |
184 | 185 | # reformat to match os.walk() |
185 | 186 | result = [] |
186 | keys = filelist.keys() | |
187 | keys.sort() | |
188 | for key in keys: | |
187 | for key in sorted(filelist): | |
189 | 188 | values = filelist[key] |
190 | 189 | values.sort() |
191 | 190 | result.append((key, [], values)) |
245 | 244 | try: |
246 | 245 | uid = os.geteuid() |
247 | 246 | gid = os.getegid() |
248 | except: | |
247 | except Exception: | |
249 | 248 | uid = 0 |
250 | 249 | gid = 0 |
251 | 250 | loc_list["-"] = { |
372 | 371 | remote_item.update({ |
373 | 372 | 'size': int(response['headers']['content-length']), |
374 | 373 | 'md5': response['headers']['etag'].strip('"\''), |
375 | 'timestamp' : dateRFC822toUnix(response['headers']['last-modified']) | |
374 | 'timestamp': dateRFC822toUnix(response['headers']['last-modified']) | |
376 | 375 | }) |
377 | 376 | try: |
378 | 377 | md5 = response['s3cmd-attrs']['md5'] |
541 | 540 | try: |
542 | 541 | src_md5 = src_list.get_md5(file) |
543 | 542 | dst_md5 = dst_list.get_md5(file) |
544 | except (IOError,OSError): | |
543 | except (IOError, OSError): | |
545 | 544 | # md5 sum verification failed - ignore that file altogether |
546 | 545 | debug(u"IGNR: %s (disappeared)" % (file)) |
547 | 546 | warning(u"%s: file disappeared, ignoring." % (file)) |
603 | 602 | # Found one, we want to copy |
604 | 603 | dst1 = dst_list.find_md5_one(md5) |
605 | 604 | debug(u"DST COPY src: %s -> %s" % (dst1, relative_file)) |
606 | copy_pairs.append((src_list[relative_file], dst1, relative_file)) | |
605 | copy_pairs.append((src_list[relative_file], dst1, relative_file, md5)) | |
607 | 606 | del(src_list[relative_file]) |
608 | 607 | del(dst_list[relative_file]) |
609 | 608 | else: |
620 | 619 | try: |
621 | 620 | md5 = src_list.get_md5(relative_file) |
622 | 621 | except IOError: |
623 | md5 = None | |
622 | md5 = None | |
624 | 623 | dst1 = dst_list.find_md5_one(md5) |
625 | 624 | if dst1 is not None: |
626 | 625 | # Found one, we want to copy |
627 | 626 | debug(u"DST COPY dst: %s -> %s" % (dst1, relative_file)) |
628 | copy_pairs.append((src_list[relative_file], dst1, relative_file)) | |
627 | copy_pairs.append((src_list[relative_file], dst1, | |
628 | relative_file, md5)) | |
629 | 629 | del(src_list[relative_file]) |
630 | 630 | else: |
631 | 631 | # we don't have this file, and we don't have a copy of this file elsewhere. Get it. |
25 | 25 | d = self.inodes[dev][inode][mtime] |
26 | 26 | if d['size'] != size: |
27 | 27 | return None |
28 | except: | |
28 | except Exception: | |
29 | 29 | return None |
30 | 30 | return d['md5'] |
31 | 31 |
5 | 5 | |
6 | 6 | from __future__ import absolute_import |
7 | 7 | |
8 | import os | |
9 | 8 | import sys |
10 | from stat import ST_SIZE | |
11 | 9 | from logging import debug, info, warning, error |
12 | from .Utils import getTextFromXml, getTreeFromXml, formatSize, unicodise, deunicodise, calculateChecksum, parseNodes, encode_to_s3 | |
10 | from .Exceptions import ParameterError | |
11 | from .S3Uri import S3UriS3 | |
12 | from .BaseUtils import getTextFromXml, getTreeFromXml, s3_quote, parseNodes | |
13 | from .Utils import formatSize, calculateChecksum | |
14 | ||
15 | SIZE_1MB = 1024 * 1024 | |
16 | ||
13 | 17 | |
14 | 18 | class MultiPartUpload(object): |
15 | ||
16 | MIN_CHUNK_SIZE_MB = 5 # 5MB | |
17 | MAX_CHUNK_SIZE_MB = 5120 # 5GB | |
18 | MAX_FILE_SIZE = 42949672960 # 5TB | |
19 | ||
20 | def __init__(self, s3, file_stream, uri, headers_baseline=None): | |
19 | """Supports MultiPartUpload and MultiPartUpload(Copy) operation""" | |
20 | MIN_CHUNK_SIZE_MB = 5 # 5MB | |
21 | MAX_CHUNK_SIZE_MB = 5 * 1024 # 5GB | |
22 | MAX_FILE_SIZE = 5 * 1024 * 1024 # 5TB | |
23 | ||
24 | def __init__(self, s3, src, dst_uri, headers_baseline=None, | |
25 | src_size=None): | |
21 | 26 | self.s3 = s3 |
22 | self.file_stream = file_stream | |
23 | self.uri = uri | |
27 | self.file_stream = None | |
28 | self.src_uri = None | |
29 | self.src_size = src_size | |
30 | self.dst_uri = dst_uri | |
24 | 31 | self.parts = {} |
25 | 32 | self.headers_baseline = headers_baseline or {} |
33 | ||
34 | if isinstance(src, S3UriS3): | |
35 | # Source is the uri of an object for an s3-to-s3 multipart copy. | |
36 | self.src_uri = src | |
37 | if not src_size: | |
38 | raise ParameterError("Source size is missing for " | |
39 | "MultipartUploadCopy operation") | |
40 | c_size = self.s3.config.multipart_copy_chunk_size_mb * SIZE_1MB | |
41 | else: | |
42 | # Source is a file_stream to upload | |
43 | self.file_stream = src | |
44 | c_size = self.s3.config.multipart_chunk_size_mb * SIZE_1MB | |
45 | ||
46 | self.chunk_size = c_size | |
26 | 47 | self.upload_id = self.initiate_multipart_upload() |
27 | 48 | |
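The reworked constructor dispatches on the source type: an S3UriS3 source selects a server-side multipart copy chunked by multipart_copy_chunk_size_mb, while anything else is treated as a stream to upload chunked by multipart_chunk_size_mb. A schematic sketch of that dispatch (the stand-in class and default sizes below are illustrative, not s3cmd's real objects):

    import io

    SIZE_1MB = 1024 * 1024

    class FakeS3Uri(object):
        """Stand-in for S3UriS3 in this sketch."""

    def pick_chunk_size(src, multipart_chunk_size_mb=15,
                        multipart_copy_chunk_size_mb=1024):
        if isinstance(src, FakeS3Uri):
            # s3-to-s3 copy: chunked by the copy-specific setting
            return multipart_copy_chunk_size_mb * SIZE_1MB
        # anything else is a stream to upload
        return multipart_chunk_size_mb * SIZE_1MB

    print(pick_chunk_size(FakeS3Uri()))          # copy chunk size
    print(pick_chunk_size(io.BytesIO(b"data")))  # upload chunk size
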
28 | 49 | def get_parts_information(self, uri, upload_id): |
29 | multipart_response = self.s3.list_multipart(uri, upload_id) | |
30 | tree = getTreeFromXml(multipart_response['data']) | |
50 | part_list = self.s3.list_multipart(uri, upload_id) | |
31 | 51 | |
32 | 52 | parts = dict() |
33 | for elem in parseNodes(tree): | |
53 | for elem in part_list: | |
34 | 54 | try: |
35 | parts[int(elem['PartNumber'])] = {'checksum': elem['ETag'], 'size': elem['Size']} | |
55 | parts[int(elem['PartNumber'])] = { | |
56 | 'checksum': elem['ETag'], | |
57 | 'size': elem['Size'] | |
58 | } | |
36 | 59 | except KeyError: |
37 | 60 | pass |
38 | 61 | |
40 | 63 | |
41 | 64 | def get_unique_upload_id(self, uri): |
42 | 65 | upload_id = "" |
43 | multipart_response = self.s3.get_multipart(uri) | |
44 | tree = getTreeFromXml(multipart_response['data']) | |
45 | for mpupload in parseNodes(tree): | |
66 | multipart_list = self.s3.get_multipart(uri) | |
67 | for mpupload in multipart_list: | |
46 | 68 | try: |
47 | 69 | mp_upload_id = mpupload['UploadId'] |
48 | 70 | mp_path = mpupload['Key'] |
49 | 71 | info("mp_path: %s, object: %s" % (mp_path, uri.object())) |
50 | 72 | if mp_path == uri.object(): |
51 | 73 | if upload_id: |
52 | raise ValueError("More than one UploadId for URI %s. Disable multipart upload, or use\n %s multipart %s\nto list the Ids, then pass a unique --upload-id into the put command." % (uri, sys.argv[0], uri)) | |
74 | raise ValueError( | |
75 | "More than one UploadId for URI %s. Disable " | |
76 | "multipart upload, or use\n %s multipart %s\n" | |
77 | "to list the Ids, then pass a unique --upload-id " | |
78 | "into the put command." % (uri, sys.argv[0], uri)) | |
53 | 79 | upload_id = mp_upload_id |
54 | 80 | except KeyError: |
55 | 81 | pass |
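When put_continue finds more than one in-progress upload for the same key, the ValueError above tells the user how to disambiguate. The workflow it prescribes looks roughly like this (bucket, object and id are made up):

    $ s3cmd multipart s3://mybucket/big.iso     # list the in-progress UploadIds
    $ s3cmd --upload-id <one-of-the-ids> put big.iso s3://mybucket/big.iso
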
64 | 90 | if self.s3.config.upload_id: |
65 | 91 | self.upload_id = self.s3.config.upload_id |
66 | 92 | elif self.s3.config.put_continue: |
67 | self.upload_id = self.get_unique_upload_id(self.uri) | |
93 | self.upload_id = self.get_unique_upload_id(self.dst_uri) | |
68 | 94 | else: |
69 | 95 | self.upload_id = "" |
70 | 96 | |
71 | 97 | if not self.upload_id: |
72 | request = self.s3.create_request("OBJECT_POST", uri = self.uri, | |
73 | headers = self.headers_baseline, | |
74 | uri_params = {'uploads': None}) | |
98 | request = self.s3.create_request("OBJECT_POST", uri=self.dst_uri, | |
99 | headers=self.headers_baseline, | |
100 | uri_params={'uploads': None}) | |
75 | 101 | response = self.s3.send_request(request) |
76 | 102 | data = response["data"] |
77 | 103 | self.upload_id = getTextFromXml(data, "UploadId") |
85 | 111 | TODO use num_processes to thread it |
86 | 112 | """ |
87 | 113 | if not self.upload_id: |
88 | raise RuntimeError("Attempting to use a multipart upload that has not been initiated.") | |
89 | ||
90 | self.chunk_size = self.s3.config.multipart_chunk_size_mb * 1024 * 1024 | |
91 | filename = self.file_stream.stream_name | |
92 | ||
93 | if filename != u"<stdin>": | |
94 | size_left = file_size = os.stat(deunicodise(filename))[ST_SIZE] | |
95 | nr_parts = file_size // self.chunk_size + (file_size % self.chunk_size and 1) | |
96 | debug("MultiPart: Uploading %s in %d parts" % (filename, nr_parts)) | |
114 | raise ParameterError("Attempting to use a multipart upload that " | |
115 | "has not been initiated.") | |
116 | ||
117 | remote_statuses = {} | |
118 | ||
119 | if self.src_uri: | |
120 | filename = self.src_uri.uri() | |
121 | # Continue is not possible with multipart copy | |
97 | 122 | else: |
98 | debug("MultiPart: Uploading from %s" % filename) | |
99 | ||
100 | remote_statuses = dict() | |
123 | filename = self.file_stream.stream_name | |
124 | ||
101 | 125 | if self.s3.config.put_continue: |
102 | remote_statuses = self.get_parts_information(self.uri, self.upload_id) | |
126 | remote_statuses = self.get_parts_information(self.dst_uri, | |
127 | self.upload_id) | |
103 | 128 | |
104 | 129 | if extra_label: |
105 | 130 | extra_label = u' ' + extra_label |
131 | labels = { | |
132 | 'source': filename, | |
133 | 'destination': self.dst_uri.uri(), | |
134 | } | |
135 | ||
106 | 136 | seq = 1 |
107 | if filename != u"<stdin>": | |
137 | ||
138 | if self.src_size: | |
139 | size_left = self.src_size | |
140 | nr_parts = self.src_size // self.chunk_size \ | |
141 | + (self.src_size % self.chunk_size and 1) | |
142 | debug("MultiPart: Uploading %s in %d parts" % (filename, nr_parts)) | |
143 | ||
108 | 144 | while size_left > 0: |
109 | 145 | offset = self.chunk_size * (seq - 1) |
110 | current_chunk_size = min(file_size - offset, self.chunk_size) | |
146 | current_chunk_size = min(self.src_size - offset, | |
147 | self.chunk_size) | |
111 | 148 | size_left -= current_chunk_size |
112 | labels = { | |
113 | 'source' : filename, | |
114 | 'destination' : self.uri.uri(), | |
115 | 'extra' : "[part %d of %d, %s]%s" % (seq, nr_parts, "%d%sB" % formatSize(current_chunk_size, human_readable = True), extra_label) | |
116 | } | |
149 | labels['extra'] = "[part %d of %d, %s]%s" % ( | |
150 | seq, nr_parts, "%d%sB" % formatSize(current_chunk_size, | |
151 | human_readable=True), | |
152 | extra_label) | |
117 | 153 | try: |
118 | self.upload_part(seq, offset, current_chunk_size, labels, remote_status = remote_statuses.get(seq)) | |
154 | if self.file_stream: | |
155 | self.upload_part( | |
156 | seq, offset, current_chunk_size, labels, | |
157 | remote_status=remote_statuses.get(seq)) | |
158 | else: | |
159 | self.copy_part( | |
160 | seq, offset, current_chunk_size, labels, | |
161 | remote_status=remote_statuses.get(seq)) | |
119 | 162 | except: |
120 | error(u"\nUpload of '%s' part %d failed. Use\n %s abortmp %s %s\nto abort the upload, or\n %s --upload-id %s put ...\nto continue the upload." | |
121 | % (filename, seq, sys.argv[0], self.uri, self.upload_id, sys.argv[0], self.upload_id)) | |
163 | error(u"\nUpload of '%s' part %d failed. Use\n " | |
164 | "%s abortmp %s %s\nto abort the upload, or\n " | |
165 | "%s --upload-id %s put ...\nto continue the upload." | |
166 | % (filename, seq, sys.argv[0], self.dst_uri, | |
167 | self.upload_id, sys.argv[0], self.upload_id)) | |
122 | 168 | raise |
123 | 169 | seq += 1 |
124 | else: | |
125 | while True: | |
126 | buffer = self.file_stream.read(self.chunk_size) | |
127 | offset = 0 # send from start of the buffer | |
128 | current_chunk_size = len(buffer) | |
129 | labels = { | |
130 | 'source' : filename, | |
131 | 'destination' : self.uri.uri(), | |
132 | 'extra' : "[part %d, %s]" % (seq, "%d%sB" % formatSize(current_chunk_size, human_readable = True)) | |
133 | } | |
134 | if len(buffer) == 0: # EOF | |
135 | break | |
136 | try: | |
137 | self.upload_part(seq, offset, current_chunk_size, labels, buffer, remote_status = remote_statuses.get(seq)) | |
138 | except: | |
139 | error(u"\nUpload of '%s' part %d failed. Use\n %s abortmp %s %s\nto abort, or\n %s --upload-id %s put ...\nto continue the upload." | |
140 | % (filename, seq, sys.argv[0], self.uri, self.upload_id, sys.argv[0], self.upload_id)) | |
141 | raise | |
142 | seq += 1 | |
170 | ||
171 | debug("MultiPart: Upload finished: %d parts", seq - 1) | |
172 | return | |
173 | ||
174 | ||
175 | # Else -> source is u"<stdin>" | |
176 | debug("MultiPart: Uploading from %s" % filename) | |
177 | while True: | |
178 | buffer = self.file_stream.read(self.chunk_size) | |
179 | offset = 0 # send from start of the buffer | |
180 | current_chunk_size = len(buffer) | |
181 | labels['extra'] = "[part %d of -, %s]%s" % ( | |
182 | seq, "%d%sB" % formatSize(current_chunk_size, | |
183 | human_readable=True), | |
184 | extra_label) | |
185 | if not buffer: | |
186 | # EOF | |
187 | break | |
188 | try: | |
189 | self.upload_part(seq, offset, current_chunk_size, labels, | |
190 | buffer, | |
191 | remote_status=remote_statuses.get(seq)) | |
192 | except: | |
193 | error(u"\nUpload of '%s' part %d failed. Use\n " | |
194 | "%s abortmp %s %s\nto abort, or\n " | |
195 | "%s --upload-id %s put ...\nto continue the upload." | |
196 | % (filename, seq, sys.argv[0], self.dst_uri, | |
197 | self.upload_id, sys.argv[0], self.upload_id)) | |
198 | raise | |
199 | seq += 1 | |
143 | 200 | |
144 | 201 | debug("MultiPart: Upload finished: %d parts", seq - 1) |
145 | 202 | |
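upload_all_parts above computes the part count with the idiom `size // chunk + (size % chunk and 1)`, i.e. ceiling division: one extra part whenever the size is not an exact multiple of the chunk size. A quick check:

    def nr_parts(size, chunk_size):
        # Integer division, plus 1 if there is a remainder.
        return size // chunk_size + (size % chunk_size and 1)

    SIZE_1MB = 1024 * 1024
    chunk = 15 * SIZE_1MB
    assert nr_parts(30 * SIZE_1MB, chunk) == 2      # exact multiple
    assert nr_parts(30 * SIZE_1MB + 1, chunk) == 3  # remainder adds a part
    assert nr_parts(1, chunk) == 1
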
146 | def upload_part(self, seq, offset, chunk_size, labels, buffer = '', remote_status = None): | |
203 | def upload_part(self, seq, offset, chunk_size, labels, buffer='', | |
204 | remote_status=None): | |
147 | 205 | """ |
148 | 206 | Upload a file chunk |
149 | 207 | http://docs.amazonwebservices.com/AmazonS3/latest/API/index.html?mpUploadUploadPart.html |
150 | 208 | """ |
151 | 209 | # TODO implement Content-MD5 |
152 | debug("Uploading part %i of %r (%s bytes)" % (seq, self.upload_id, chunk_size)) | |
210 | debug("Uploading part %i of %r (%s bytes)" % (seq, self.upload_id, | |
211 | chunk_size)) | |
153 | 212 | |
154 | 213 | if remote_status is not None: |
155 | 214 | if int(remote_status['size']) == chunk_size: |
156 | checksum = calculateChecksum(buffer, self.file_stream, offset, chunk_size, self.s3.config.send_chunk) | |
215 | checksum = calculateChecksum(buffer, self.file_stream, offset, | |
216 | chunk_size, | |
217 | self.s3.config.send_chunk) | |
157 | 218 | remote_checksum = remote_status['checksum'].strip('"\'') |
158 | 219 | if remote_checksum == checksum: |
159 | warning("MultiPart: size and md5sum match for %s part %d, skipping." % (self.uri, seq)) | |
220 | warning("MultiPart: size and md5sum match for %s part %d, " | |
221 | "skipping." % (self.dst_uri, seq)) | |
160 | 222 | self.parts[seq] = remote_status['checksum'] |
161 | return | |
223 | return None | |
162 | 224 | else: |
163 | warning("MultiPart: checksum (%s vs %s) does not match for %s part %d, reuploading." | |
164 | % (remote_checksum, checksum, self.uri, seq)) | |
225 | warning("MultiPart: checksum (%s vs %s) does not match for" | |
226 | " %s part %d, reuploading." | |
227 | % (remote_checksum, checksum, self.dst_uri, seq)) | |
165 | 228 | else: |
166 | warning("MultiPart: size (%d vs %d) does not match for %s part %d, reuploading." | |
167 | % (int(remote_status['size']), chunk_size, self.uri, seq)) | |
168 | ||
169 | headers = { "content-length": str(chunk_size) } | |
170 | query_string_params = {'partNumber':'%s' % seq, | |
229 | warning("MultiPart: size (%d vs %d) does not match for %s part" | |
230 | " %d, reuploading." % (int(remote_status['size']), | |
231 | chunk_size, self.dst_uri, seq)) | |
232 | ||
233 | headers = {"content-length": str(chunk_size)} | |
234 | query_string_params = {'partNumber': '%s' % seq, | |
171 | 235 | 'uploadId': self.upload_id} |
172 | request = self.s3.create_request("OBJECT_PUT", uri = self.uri, | |
173 | headers = headers, | |
174 | uri_params = query_string_params) | |
175 | response = self.s3.send_file(request, self.file_stream, labels, buffer, offset = offset, chunk_size = chunk_size) | |
236 | request = self.s3.create_request("OBJECT_PUT", uri=self.dst_uri, | |
237 | headers=headers, | |
238 | uri_params=query_string_params) | |
239 | response = self.s3.send_file(request, self.file_stream, labels, buffer, | |
240 | offset=offset, chunk_size=chunk_size) | |
176 | 241 | self.parts[seq] = response["headers"].get('etag', '').strip('"\'') |
242 | return response | |
243 | ||
244 | def copy_part(self, seq, offset, chunk_size, labels, remote_status=None): | |
245 | """ | |
246 | Copy a remote file chunk | |
247 | http://docs.amazonwebservices.com/AmazonS3/latest/API/index.html?mpUploadUploadPart.html | |
248 | http://docs.amazonwebservices.com/AmazonS3/latest/API/mpUploadUploadPartCopy.html | |
249 | """ | |
250 | debug("Copying part %i of %r (%s bytes)" % (seq, self.upload_id, | |
251 | chunk_size)) | |
252 | ||
253 | # set up headers with copy-params. | |
254 | # Examples: | |
255 | # x-amz-copy-source: /source_bucket/sourceObject | |
256 | # x-amz-copy-source-range:bytes=first-last | |
257 | # x-amz-copy-source-if-match: etag | |
258 | # x-amz-copy-source-if-none-match: etag | |
259 | # x-amz-copy-source-if-unmodified-since: time_stamp | |
260 | # x-amz-copy-source-if-modified-since: time_stamp | |
261 | headers = { | |
262 | "x-amz-copy-source": s3_quote("/%s/%s" % (self.src_uri.bucket(), | |
263 | self.src_uri.object()), | |
264 | quote_backslashes=False, | |
265 | unicode_output=True) | |
266 | } | |
267 | ||
268 | # byte range, with end byte included. A 10 byte file has bytes=0-9 | |
269 | headers["x-amz-copy-source-range"] = \ | |
270 | "bytes=%d-%d" % (offset, (offset + chunk_size - 1)) | |
271 | ||
272 | query_string_params = {'partNumber': '%s' % seq, | |
273 | 'uploadId': self.upload_id} | |
274 | request = self.s3.create_request("OBJECT_PUT", uri=self.dst_uri, | |
275 | headers=headers, | |
276 | uri_params=query_string_params) | |
277 | ||
278 | labels[u'action'] = u'remote copy' | |
279 | response = self.s3.send_request_with_progress(request, labels, | |
280 | chunk_size) | |
281 | ||
282 | # NOTE: Amazon sends whitespace while upload progresses, which | |
283 | # accumulates in response body and seems to confuse XML parser. | |
284 | # Strip newlines to find ETag in XML response data | |
285 | #data = response["data"].replace("\n", '') | |
286 | self.parts[seq] = getTextFromXml(response['data'], "ETag") or '' | |
287 | ||
177 | 288 | return response |
178 | 289 | |
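copy_part above expresses each chunk as an inclusive x-amz-copy-source-range: a part starting at offset with chunk_size bytes spans bytes offset through offset + chunk_size - 1. A standalone sketch of the ranges generated for a tiny example:

    def copy_source_range(offset, chunk_size):
        # Inclusive range, as noted above: a 10-byte file is bytes=0-9.
        return "bytes=%d-%d" % (offset, offset + chunk_size - 1)

    size, chunk, offset, seq = 10, 4, 0, 1
    while offset < size:
        current = min(size - offset, chunk)
        print(seq, copy_source_range(offset, current))
        offset += current
        seq += 1
    # prints: 1 bytes=0-3 / 2 bytes=4-7 / 3 bytes=8-9
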
179 | 290 | def complete_multipart_upload(self): |
187 | 298 | part_xml = "<Part><PartNumber>%i</PartNumber><ETag>%s</ETag></Part>" |
188 | 299 | for seq, etag in self.parts.items(): |
189 | 300 | parts_xml.append(part_xml % (seq, etag)) |
190 | body = "<CompleteMultipartUpload>%s</CompleteMultipartUpload>" % ("".join(parts_xml)) | |
191 | ||
192 | headers = { "content-length": str(len(body)) } | |
193 | request = self.s3.create_request("OBJECT_POST", uri = self.uri, | |
194 | headers = headers, body = body, | |
195 | uri_params = {'uploadId': self.upload_id}) | |
301 | body = "<CompleteMultipartUpload>%s</CompleteMultipartUpload>" \ | |
302 | % "".join(parts_xml) | |
303 | ||
304 | headers = {"content-length": str(len(body))} | |
305 | request = self.s3.create_request( | |
306 | "OBJECT_POST", uri=self.dst_uri, headers=headers, body=body, | |
307 | uri_params={'uploadId': self.upload_id}) | |
196 | 308 | response = self.s3.send_request(request) |
197 | 309 | |
198 | 310 | return response |
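The completion step serializes self.parts into the CompleteMultipartUpload body. A standalone version of that serialization (sorting added here, since S3 expects parts listed in ascending PartNumber order):

    def complete_body(parts):
        # Build the CompleteMultipartUpload XML body; `parts` maps
        # part number -> ETag, as collected by upload_part/copy_part.
        part_xml = "<Part><PartNumber>%i</PartNumber><ETag>%s</ETag></Part>"
        inner = "".join(part_xml % (seq, etag)
                        for seq, etag in sorted(parts.items()))
        return "<CompleteMultipartUpload>%s</CompleteMultipartUpload>" % inner

    print(complete_body({1: 'etag-1', 2: 'etag-2'}))
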
209 | 321 | response = None |
210 | 322 | return response |
211 | 323 | |
324 | ||
212 | 325 | # vim:et:ts=4:sts=4:ai |
6 | 6 | ## Copyright: TGRMN Software and contributors |
7 | 7 | |
8 | 8 | package = "s3cmd" |
9 | version = "2.1.0" | |
9 | version = "2.2.0" | |
10 | 10 | url = "http://s3tools.org" |
11 | 11 | license = "GNU GPL v2+" |
12 | 12 | short_description = "Command line tool for managing Amazon S3 and CloudFront services" |
11 | 11 | import os |
12 | 12 | import time |
13 | 13 | import errno |
14 | import base64 | |
15 | 14 | import mimetypes |
16 | 15 | import io |
17 | 16 | import pprint |
24 | 23 | from urlparse import urlparse |
25 | 24 | except ImportError: |
26 | 25 | from urllib.parse import urlparse |
26 | try: | |
27 | # Python 2 support | |
28 | from base64 import encodestring | |
29 | except ImportError: | |
30 | # Python 3.9.0+ support | |
31 | from base64 import encodebytes as encodestring | |
27 | 32 | |
28 | 33 | import select |
29 | 34 | |
32 | 37 | except ImportError: |
33 | 38 | from md5 import md5 |
34 | 39 | |
35 | from .Utils import * | |
40 | from .BaseUtils import (getListFromXml, getTextFromXml, getRootTagName, | |
41 | decode_from_s3, encode_to_s3, s3_quote) | |
42 | from .Utils import (convertHeaderTupleListToDict, hash_file_md5, unicodise, | |
43 | deunicodise, check_bucket_name, | |
44 | check_bucket_name_dns_support, getHostnameFromBucket, | |
45 | calculateChecksum) | |
36 | 46 | from .SortedDict import SortedDict |
37 | 47 | from .AccessLog import AccessLog |
38 | 48 | from .ACL import ACL, GranteeLogDelivery |
43 | 53 | from .S3Uri import S3Uri |
44 | 54 | from .ConnMan import ConnMan |
45 | 55 | from .Crypto import (sign_request_v2, sign_request_v4, checksum_sha256_file, |
46 | checksum_sha256_buffer, s3_quote, format_param_str) | |
56 | checksum_sha256_buffer, format_param_str) | |
47 | 57 | |
48 | 58 | try: |
49 | 59 | from ctypes import ArgumentError |
122 | 132 | result = (None, None) |
123 | 133 | return result |
124 | 134 | |
135 | ||
125 | 136 | EXPECT_CONTINUE_TIMEOUT = 2 |
126 | ||
137 | SIZE_1MB = 1024 * 1024 | |
127 | 138 | |
128 | 139 | __all__ = [] |
140 | ||
129 | 141 | class S3Request(object): |
130 | 142 | region_map = {} |
131 | 143 | ## S3 sometimes sends HTTP-301, HTTP-307 response |
351 | 363 | num_prefixes = 0 |
352 | 364 | max_keys = limit |
353 | 365 | while truncated: |
354 | response = self.bucket_list_noparse(bucket, prefix, recursive, uri_params, max_keys) | |
366 | response = self.bucket_list_noparse(bucket, prefix, recursive, | |
367 | uri_params, max_keys) | |
355 | 368 | current_list = _get_contents(response["data"]) |
356 | 369 | current_prefixes = _get_common_prefixes(response["data"]) |
357 | 370 | num_objects += len(current_list) |
362 | 375 | if truncated: |
363 | 376 | if limit == -1 or num_objects + num_prefixes < limit: |
364 | 377 | if current_list: |
365 | uri_params['marker'] = _get_next_marker(response["data"], current_list) | |
378 | uri_params['marker'] = \ | |
379 | _get_next_marker(response["data"], current_list) | |
380 | elif current_prefixes: | |
381 | uri_params['marker'] = current_prefixes[-1]["Prefix"] | |
366 | 382 | else: |
367 | uri_params['marker'] = current_prefixes[-1]["Prefix"] | |
383 | # Unexpectedly, the server lied, and so the previous | |
384 | # response was not truncated. So, no new key to get. | |
385 | yield False, current_prefixes, current_list | |
386 | break | |
368 | 387 | debug("Listing continues after '%s'" % uri_params['marker']) |
369 | 388 | else: |
370 | 389 | yield truncated, current_prefixes, current_list |
386 | 405 | #debug(response) |
387 | 406 | return response |
388 | 407 | |
389 | def bucket_create(self, bucket, bucket_location = None): | |
408 | def bucket_create(self, bucket, bucket_location = None, extra_headers = None): | |
390 | 409 | headers = SortedDict(ignore_case = True) |
410 | if extra_headers: | |
411 | headers.update(extra_headers) | |
412 | ||
391 | 413 | body = "" |
392 | 414 | if bucket_location and bucket_location.strip().upper() != "US" and bucket_location.strip().lower() != "us-east-1": |
393 | 415 | bucket_location = bucket_location.strip() |
446 | 468 | return location |
447 | 469 | |
448 | 470 | def get_bucket_requester_pays(self, uri): |
449 | request = self.create_request("BUCKET_LIST", bucket = uri.bucket(), | |
450 | uri_params = {'requestPayment': None}) | |
451 | response = self.send_request(request) | |
452 | payer = getTextFromXml(response['data'], "Payer") | |
471 | request = self.create_request("BUCKET_LIST", bucket=uri.bucket(), | |
472 | uri_params={'requestPayment': None}) | |
473 | response = self.send_request(request) | |
474 | resp_data = response.get('data', '') | |
475 | if resp_data: | |
476 | payer = getTextFromXml(response['data'], "Payer") | |
477 | else: | |
478 | payer = None | |
453 | 479 | return payer |
454 | 480 | |
455 | 481 | def bucket_info(self, uri): |
458 | 484 | try: |
459 | 485 | response['requester-pays'] = self.get_bucket_requester_pays(uri) |
460 | 486 | except S3Error as e: |
461 | response['requester-pays'] = 'none' | |
487 | response['requester-pays'] = None | |
462 | 488 | return response |
463 | 489 | |
464 | 490 | def website_info(self, uri, bucket_location = None): |
515 | 541 | def expiration_info(self, uri, bucket_location = None): |
516 | 542 | bucket = uri.bucket() |
517 | 543 | |
518 | request = self.create_request("BUCKET_LIST", bucket = bucket, | |
519 | uri_params = {'lifecycle': None}) | |
544 | request = self.create_request("BUCKET_LIST", bucket=bucket, | |
545 | uri_params={'lifecycle': None}) | |
520 | 546 | try: |
521 | 547 | response = self.send_request(request) |
522 | response['prefix'] = getTextFromXml(response['data'], ".//Rule//Prefix") | |
523 | response['date'] = getTextFromXml(response['data'], ".//Rule//Expiration//Date") | |
524 | response['days'] = getTextFromXml(response['data'], ".//Rule//Expiration//Days") | |
525 | return response | |
526 | 548 | except S3Error as e: |
527 | 549 | if e.status == 404: |
528 | debug("Could not get /?lifecycle - lifecycle probably not configured for this bucket") | |
550 | debug("Could not get /?lifecycle - lifecycle probably not " | |
551 | "configured for this bucket") | |
529 | 552 | return None |
530 | 553 | elif e.status == 501: |
531 | debug("Could not get /?lifecycle - lifecycle support not implemented by the server") | |
554 | debug("Could not get /?lifecycle - lifecycle support not " | |
555 | "implemented by the server") | |
532 | 556 | return None |
533 | 557 | raise |
558 | ||
559 | root_tag_name = getRootTagName(response['data']) | |
560 | if root_tag_name != "LifecycleConfiguration": | |
561 | debug("Could not get /?lifecycle - unexpected xml response: " | |
562 | "%s", root_tag_name) | |
563 | return None | |
564 | response['prefix'] = getTextFromXml(response['data'], | |
565 | ".//Rule//Prefix") | |
566 | response['date'] = getTextFromXml(response['data'], | |
567 | ".//Rule//Expiration//Date") | |
568 | response['days'] = getTextFromXml(response['data'], | |
569 | ".//Rule//Expiration//Days") | |
570 | return response | |
534 | 571 | |
535 | 572 | def expiration_set(self, uri, bucket_location = None): |
536 | 573 | if self.config.expiry_date and self.config.expiry_days: |
676 | 713 | if not self.config.enable_multipart and filename == "-": |
677 | 714 | raise ParameterError("Multi-part upload is required to upload from stdin") |
678 | 715 | if self.config.enable_multipart: |
679 | if size > self.config.multipart_chunk_size_mb * 1024 * 1024 or filename == "-": | |
716 | if size > self.config.multipart_chunk_size_mb * SIZE_1MB or filename == "-": | |
680 | 717 | multipart = True |
681 | if size > self.config.multipart_max_chunks * self.config.multipart_chunk_size_mb * 1024 * 1024: | |
718 | if size > self.config.multipart_max_chunks * self.config.multipart_chunk_size_mb * SIZE_1MB: | |
682 | 719 | raise ParameterError("Chunk size %d MB results in more than %d chunks. Please increase --multipart-chunk-size-mb" % \ |
683 | 720 | (self.config.multipart_chunk_size_mb, self.config.multipart_max_chunks)) |
684 | 721 | if multipart: |
693 | 730 | # an md5. |
694 | 731 | try: |
695 | 732 | info = self.object_info(uri) |
696 | except: | |
733 | except Exception: | |
697 | 734 | info = None |
698 | 735 | |
699 | 736 | if info is not None: |
775 | 812 | raise ParameterError("You must restore a file for 1 or more days") |
776 | 813 | if self.config.restore_priority not in ['Standard', 'Expedited', 'Bulk']: |
777 | 814 | raise ParameterError("Valid restoration priorities: bulk, standard, expedited") |
778 | body = '<RestoreRequest xmlns="http://s3.amazonaws.com/doc/2006-3-01">' | |
815 | body = '<RestoreRequest xmlns="http://s3.amazonaws.com/doc/2006-03-01/">' | |
779 | 816 | body += (' <Days>%s</Days>' % self.config.restore_days) |
780 | 817 | body += ' <GlacierJobParameters>' |
781 | 818 | body += (' <Tier>%s</Tier>' % self.config.restore_priority) |
803 | 840 | 'server', |
804 | 841 | 'x-amz-id-2', |
805 | 842 | 'x-amz-request-id', |
843 | # Other headers that are not copied by a direct copy | |
844 | 'x-amz-storage-class', | |
845 | ## We should probably also add server-side encryption headers | |
806 | 846 | ] |
807 | 847 | |
808 | 848 | for h in to_remove + self.config.remove_headers: |
810 | 850 | del headers[h.lower()] |
811 | 851 | return headers |
812 | 852 | |
813 | def object_copy(self, src_uri, dst_uri, extra_headers = None): | |
853 | def object_copy(self, src_uri, dst_uri, extra_headers=None, | |
854 | src_size=None, extra_label="", replace_meta=False): | |
855 | """Remote copy an object and eventually set metadata | |
856 | ||
857 | Note: A little memo description of the nightmare for performance here: | |
858 | ** FOR AWS, 2 cases: | |
859 | - COPY will copy the metadata of the source to dest, but you can't | |
860 | modify them. Any additional header will be ignored anyway. | |
861 | - REPLACE will set the additional metadata headers that are provided | |
862 | but will not copy any of the source headers. | |
863 | So, to add to existing meta during copy, you have to do an object_info | |
864 | to get original source headers, then modify, then use REPLACE for the | |
865 | copy operation. | |
866 | ||
867 | ** For Minio and maybe other implementations: | |
868 | - if additional headers are sent, they will be set to the destination | |
869 | on top of the source's original meta in both COPY and REPLACE cases. | |
870 | It is a nice behavior except that it differs from the aws one. | |
871 | ||
872 | As it was still too easy, there is another catch: | |
873 | In all cases, for multipart copies, metadata is never copied | |
874 | from the source. | |
875 | """ | |
814 | 876 | if src_uri.type != "s3": |
815 | 877 | raise ValueError("Expected URI type 's3', got '%s'" % src_uri.type) |
816 | 878 | if dst_uri.type != "s3": |
824 | 886 | if exc.status != 501: |
825 | 887 | raise exc |
826 | 888 | acl = None |
827 | headers = SortedDict(ignore_case = True) | |
828 | headers['x-amz-copy-source'] = "/%s/%s" % (src_uri.bucket(), | |
829 | urlencode_string(src_uri.object(), unicode_output=True)) | |
830 | headers['x-amz-metadata-directive'] = "COPY" | |
889 | ||
890 | multipart = False | |
891 | headers = None | |
892 | ||
893 | if extra_headers or self.config.mime_type: | |
894 | # Force replace, that will force getting meta with object_info() | |
895 | replace_meta = True | |
896 | ||
897 | if replace_meta: | |
898 | src_info = self.object_info(src_uri) | |
899 | headers = src_info['headers'] | |
900 | src_size = int(headers["content-length"]) | |
901 | ||
902 | if self.config.enable_multipart: | |
903 | # Get size of remote source only if multipart is enabled and no | |
904 | # size info was provided | |
905 | src_headers = headers | |
906 | if src_size is None: | |
907 | src_info = self.object_info(src_uri) | |
908 | src_headers = src_info['headers'] | |
909 | src_size = int(src_headers["content-length"]) | |
910 | ||
911 | # If we are over the grand maximum size for a normal copy/modify | |
912 | # (> 5GB) go nuclear and use multipart copy as the only option to | |
913 | # modify an object. | |
914 | # Reason is an aws s3 design bug. See: | |
915 | # https://github.com/aws/aws-sdk-java/issues/367 | |
916 | if src_uri is dst_uri: | |
917 | # optimisation in the case of modify | |
918 | threshold = MultiPartUpload.MAX_CHUNK_SIZE_MB * SIZE_1MB | |
919 | else: | |
920 | threshold = self.config.multipart_copy_chunk_size_mb * SIZE_1MB | |
921 | ||
922 | if src_size > threshold: | |
923 | # Sadly, s3 has a bad logic as metadata will not be copied for | |
924 | # multipart copy unlike what is done for direct copies. | |
925 | # TODO: Optimize by re-using the object_info request done | |
926 | # earlier at the fetch remote stage, and preserve headers. | |
927 | if src_headers is None: | |
928 | src_info = self.object_info(src_uri) | |
929 | src_headers = src_info['headers'] | |
930 | src_size = int(src_headers["content-length"]) | |
931 | headers = src_headers | |
932 | multipart = True | |
933 | ||
934 | if headers: | |
935 | self._sanitize_headers(headers) | |
936 | headers = SortedDict(headers, ignore_case=True) | |
937 | else: | |
938 | headers = SortedDict(ignore_case=True) | |
939 | ||
940 | # The following metadata is updated even in COPY by aws | |
831 | 941 | if self.config.acl_public: |
832 | 942 | headers["x-amz-acl"] = "public-read" |
833 | 943 | |
840 | 950 | ## Set kms headers |
841 | 951 | if self.config.kms_key: |
842 | 952 | headers['x-amz-server-side-encryption'] = 'aws:kms' |
843 | headers['x-amz-server-side-encryption-aws-kms-key-id'] = self.config.kms_key | |
844 | ||
953 | headers['x-amz-server-side-encryption-aws-kms-key-id'] = \ | |
954 | self.config.kms_key | |
955 | ||
956 | # The following metadata is not updated in a simple COPY by aws. | |
845 | 957 | if extra_headers: |
846 | 958 | headers.update(extra_headers) |
847 | 959 | |
848 | request = self.create_request("OBJECT_PUT", uri = dst_uri, headers = headers) | |
849 | response = self.send_request(request) | |
960 | if self.config.mime_type: | |
961 | headers["content-type"] = self.config.mime_type | |
962 | ||
963 | # "COPY" or "REPLACE" | |
964 | if not replace_meta: | |
965 | headers['x-amz-metadata-directive'] = "COPY" | |
966 | else: | |
967 | headers['x-amz-metadata-directive'] = "REPLACE" | |
968 | ||
969 | if multipart: | |
970 | # Multipart decision. Only do multipart copy for remote s3 files | |
971 | # bigger than the multipart copy threshold. | |
972 | ||
973 | # Multipart requests are quite different... delegate | |
974 | response = self.copy_file_multipart(src_uri, dst_uri, src_size, | |
975 | headers, extra_label) | |
976 | else: | |
977 | # Not multipart... direct request | |
978 | headers['x-amz-copy-source'] = s3_quote( | |
979 | "/%s/%s" % (src_uri.bucket(), src_uri.object()), | |
980 | quote_backslashes=False, unicode_output=True) | |
981 | ||
982 | request = self.create_request("OBJECT_PUT", uri=dst_uri, | |
983 | headers=headers) | |
984 | response = self.send_request(request) | |
985 | ||
850 | 986 | if response["data"] and getRootTagName(response["data"]) == "Error": |
851 | #http://doc.s3.amazonaws.com/proposals/copy.html | |
987 | # http://doc.s3.amazonaws.com/proposals/copy.html | |
852 | 988 | # Error during copy, status will be 200, so force error code 500 |
853 | 989 | response["status"] = 500 |
854 | error("Server error during the COPY operation. Overwrite response status to 500") | |
990 | error("Server error during the COPY operation. Overwrite response " | |
991 | "status to 500") | |
855 | 992 | raise S3Error(response) |
856 | 993 | |
857 | 994 | if self.config.acl_public is None and acl: |
864 | 1001 | raise exc |
865 | 1002 | return response |
866 | 1003 | |
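The memo in the object_copy docstring boils down to a small decision: use REPLACE (with headers pre-seeded from object_info) whenever metadata must change, COPY otherwise, and switch to multipart copy once the source exceeds the relevant size threshold. A schematic sketch of that decision, independent of the real S3 class (threshold defaults below are illustrative):

    SIZE_1MB = 1024 * 1024

    def plan_copy(src_size, extra_headers=None, mime_type=None,
                  modify_in_place=False, copy_chunk_mb=1024,
                  max_chunk_mb=5 * 1024):
        replace_meta = bool(extra_headers or mime_type)
        directive = "REPLACE" if replace_meta else "COPY"
        # modify (src is dst) only goes multipart past the 5 GB hard limit;
        # a plain copy switches as soon as the copy chunk size is exceeded.
        threshold = (max_chunk_mb if modify_in_place else copy_chunk_mb) * SIZE_1MB
        return directive, src_size > threshold

    print(plan_copy(6 * 1024 ** 3))                    # ('COPY', True)
    print(plan_copy(10 * SIZE_1MB, mime_type='a/b'))   # ('REPLACE', False)
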
867 | def object_modify(self, src_uri, dst_uri, extra_headers = None): | |
868 | ||
869 | if src_uri.type != "s3": | |
870 | raise ValueError("Expected URI type 's3', got '%s'" % src_uri.type) | |
871 | if dst_uri.type != "s3": | |
872 | raise ValueError("Expected URI type 's3', got '%s'" % dst_uri.type) | |
873 | ||
874 | info_response = self.object_info(src_uri) | |
875 | headers = info_response['headers'] | |
876 | headers = self._sanitize_headers(headers) | |
877 | ||
878 | try: | |
879 | acl = self.get_acl(src_uri) | |
880 | except S3Error as exc: | |
881 | # Ignore the exception and don't fail the modify | |
882 | # if the server doesn't support setting ACLs | |
883 | if exc.status != 501: | |
884 | raise exc | |
885 | acl = None | |
886 | ||
887 | headers['x-amz-copy-source'] = "/%s/%s" % (src_uri.bucket(), | |
888 | urlencode_string(src_uri.object(), unicode_output=True)) | |
889 | headers['x-amz-metadata-directive'] = "REPLACE" | |
890 | ||
891 | # cannot change between standard and reduced redundancy with a REPLACE. | |
892 | ||
893 | ## Set server side encryption | |
894 | if self.config.server_side_encryption: | |
895 | headers["x-amz-server-side-encryption"] = "AES256" | |
896 | ||
897 | ## Set kms headers | |
898 | if self.config.kms_key: | |
899 | headers['x-amz-server-side-encryption'] = 'aws:kms' | |
900 | headers['x-amz-server-side-encryption-aws-kms-key-id'] = self.config.kms_key | |
901 | ||
902 | if extra_headers: | |
903 | headers.update(extra_headers) | |
904 | ||
905 | if self.config.mime_type: | |
906 | headers["content-type"] = self.config.mime_type | |
907 | ||
908 | request = self.create_request("OBJECT_PUT", uri = src_uri, headers = headers) | |
909 | response = self.send_request(request) | |
910 | if response["data"] and getRootTagName(response["data"]) == "Error": | |
911 | #http://doc.s3.amazonaws.com/proposals/copy.html | |
912 | # Error during modify, status will be 200, so force error code 500 | |
913 | response["status"] = 500 | |
914 | error("Server error during the MODIFY operation. Overwrite response status to 500") | |
915 | raise S3Error(response) | |
916 | ||
917 | if acl != None: | |
918 | try: | |
919 | self.set_acl(src_uri, acl) | |
920 | except S3Error as exc: | |
921 | # Ignore the exception and don't fail the modify | |
922 | # if the server doesn't support setting ACLs | |
923 | if exc.status != 501: | |
924 | raise exc | |
925 | ||
926 | return response | |
927 | ||
928 | def object_move(self, src_uri, dst_uri, extra_headers = None): | |
929 | response_copy = self.object_copy(src_uri, dst_uri, extra_headers) | |
1004 | def object_modify(self, src_uri, dst_uri, extra_headers=None, | |
1005 | src_size=None, extra_label=""): | |
1006 | # dst_uri = src_uri: will optimize by using multipart only in the worst case | |
1007 | return self.object_copy(src_uri, src_uri, extra_headers, src_size, | |
1008 | extra_label, replace_meta=True) | |
1009 | ||
1010 | def object_move(self, src_uri, dst_uri, extra_headers=None, | |
1011 | src_size=None, extra_label=""): | |
1012 | response_copy = self.object_copy(src_uri, dst_uri, extra_headers, | |
1013 | src_size, extra_label) | |
930 | 1014 | debug("Object %s copied to %s" % (src_uri, dst_uri)) |
931 | if not response_copy["data"] or getRootTagName(response_copy["data"]) == "CopyObjectResult": | |
1015 | if not response_copy["data"] \ | |
1016 | or getRootTagName(response_copy["data"]) \ | |
1017 | in ["CopyObjectResult", "CompleteMultipartUploadResult"]: | |
932 | 1018 | self.object_delete(src_uri) |
933 | 1019 | debug("Object '%s' deleted", src_uri) |
934 | 1020 | else: |
935 | debug("Object '%s' NOT deleted because of an unexepected response data content.", src_uri) | |
1021 | warning("Object '%s' NOT deleted because of an unexpected " | |
1022 | "response data content.", src_uri) | |
936 | 1023 | return response_copy |
937 | 1024 | |
938 | 1025 | def object_info(self, uri): |
939 | request = self.create_request("OBJECT_HEAD", uri = uri) | |
1026 | request = self.create_request("OBJECT_HEAD", uri=uri) | |
940 | 1027 | response = self.send_request(request) |
941 | 1028 | return response |
942 | 1029 | |
943 | 1030 | def get_acl(self, uri): |
944 | 1031 | if uri.has_object(): |
945 | request = self.create_request("OBJECT_GET", uri = uri, | |
946 | uri_params = {'acl': None}) | |
1032 | request = self.create_request("OBJECT_GET", uri=uri, | |
1033 | uri_params={'acl': None}) | |
947 | 1034 | else: |
948 | request = self.create_request("BUCKET_LIST", bucket = uri.bucket(), | |
949 | uri_params = {'acl': None}) | |
1035 | request = self.create_request("BUCKET_LIST", bucket=uri.bucket(), | |
1036 | uri_params={'acl': None}) | |
950 | 1037 | |
951 | 1038 | response = self.send_request(request) |
952 | 1039 | acl = ACL(response['data']) |
953 | 1040 | return acl |
954 | 1041 | |
955 | 1042 | def set_acl(self, uri, acl): |
956 | # dreamhost doesn't support set_acl properly | |
957 | if 'objects.dreamhost.com' in self.config.host_base: | |
958 | return { 'status' : 501 } # not implemented | |
959 | ||
960 | 1043 | body = u"%s"% acl |
961 | 1044 | debug(u"set_acl(%s): acl-xml: %s" % (uri, body)) |
962 | 1045 | |
1060 | 1143 | response = self.send_request(request) |
1061 | 1144 | return response |
1062 | 1145 | |
1063 | def get_multipart(self, uri): | |
1064 | request = self.create_request("BUCKET_LIST", bucket = uri.bucket(), | |
1065 | uri_params = {'uploads': None}) | |
1146 | def get_multipart(self, uri, uri_params=None, limit=-1): | |
1147 | upload_list = [] | |
1148 | for truncated, uploads in self.get_multipart_streaming(uri, | |
1149 | uri_params, | |
1150 | limit): | |
1151 | upload_list.extend(uploads) | |
1152 | ||
1153 | return upload_list | |
1154 | ||
1155 | def get_multipart_streaming(self, uri, uri_params=None, limit=-1): | |
1156 | uri_params = uri_params and uri_params.copy() or {} | |
1157 | bucket = uri.bucket() | |
1158 | ||
1159 | truncated = True | |
1160 | num_objects = 0 | |
1161 | max_keys = limit | |
1162 | ||
1163 | # It is the "uploads: None" in uri_params that will change the | |
1164 | # behavior of bucket_list to return multiparts instead of keys | |
1165 | uri_params['uploads'] = None | |
1166 | while truncated: | |
1167 | response = self.bucket_list_noparse(bucket, recursive=True, | |
1168 | uri_params=uri_params, | |
1169 | max_keys=max_keys) | |
1170 | ||
1171 | xml_data = response["data"] | |
1172 | # extract the list of info for in-progress uploads | |
1173 | upload_list = getListFromXml(xml_data, "Upload") | |
1174 | num_objects += len(upload_list) | |
1175 | if limit > num_objects: | |
1176 | max_keys = limit - num_objects | |
1177 | ||
1178 | xml_truncated = getTextFromXml(xml_data, ".//IsTruncated") | |
1179 | if not xml_truncated or xml_truncated.lower() == "false": | |
1180 | truncated = False | |
1181 | ||
1182 | if truncated: | |
1183 | if limit == -1 or num_objects < limit: | |
1184 | if upload_list: | |
1185 | next_key = getTextFromXml(xml_data, "NextKeyMarker") | |
1186 | if not next_key: | |
1187 | next_key = upload_list[-1]["Key"] | |
1188 | uri_params['KeyMarker'] = next_key | |
1189 | ||
1190 | upload_id_marker = getTextFromXml( | |
1191 | xml_data, "NextUploadIdMarker") | |
1192 | if upload_id_marker: | |
1193 | uri_params['UploadIdMarker'] = upload_id_marker | |
1194 | elif 'UploadIdMarker' in uri_params: | |
1195 | # Clear any pre-existing value | |
1196 | del uri_params['UploadIdMarker'] | |
1197 | else: | |
1198 | # Unexpectedly, the server lied, and so the previous | |
1199 | # response was not truncated. So, no new key to get. | |
1200 | yield False, upload_list | |
1201 | break | |
1202 | debug("Listing continues after '%s'" % | |
1203 | uri_params['KeyMarker']) | |
1204 | else: | |
1205 | yield truncated, upload_list | |
1206 | break | |
1207 | yield truncated, upload_list | |
1208 | ||
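get_multipart (and list_multipart below) are now thin wrappers that drain their *_streaming counterparts, which page through results via KeyMarker/UploadIdMarker (or part-number-marker) until IsTruncated turns false. A caller can also consume the generator directly to avoid materializing everything; a sketch with a stand-in generator:

    def fake_streaming(pages):
        # Stand-in for get_multipart_streaming: yields (truncated, chunk).
        for i, page in enumerate(pages):
            yield (i + 1 < len(pages)), page

    total = 0
    for truncated, uploads in fake_streaming([[{'Key': 'a'}, {'Key': 'b'}],
                                              [{'Key': 'c'}]]):
        total += len(uploads)   # a real caller could stop early here
    print("in-progress uploads:", total)  # 3
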
1209 | def list_multipart(self, uri, upload_id, uri_params=None, limit=-1): | |
1210 | part_list = [] | |
1211 | for truncated, parts in self.list_multipart_streaming(uri, | |
1212 | upload_id, | |
1213 | uri_params, | |
1214 | limit): | |
1215 | part_list.extend(parts) | |
1216 | ||
1217 | return part_list | |
1218 | ||
1219 | def list_multipart_streaming(self, uri, upload_id, uri_params=None, | |
1220 | limit=-1): | |
1221 | uri_params = uri_params and uri_params.copy() or {} | |
1222 | ||
1223 | truncated = True | |
1224 | num_objects = 0 | |
1225 | max_parts = limit | |
1226 | ||
1227 | while truncated: | |
1228 | response = self.list_multipart_noparse(uri, upload_id, | |
1229 | uri_params, max_parts) | |
1230 | ||
1231 | xml_data = response["data"] | |
1232 | # extract list of multipart upload parts | |
1233 | part_list = getListFromXml(xml_data, "Part") | |
1234 | num_objects += len(part_list) | |
1235 | if limit > num_objects: | |
1236 | max_parts = limit - num_objects | |
1237 | ||
1238 | xml_truncated = getTextFromXml(xml_data, ".//IsTruncated") | |
1239 | if not xml_truncated or xml_truncated.lower() == "false": | |
1240 | truncated = False | |
1241 | ||
1242 | if truncated: | |
1243 | if limit == -1 or num_objects < limit: | |
1244 | if part_list: | |
1245 | next_part_number = getTextFromXml( | |
1246 | xml_data, "NextPartNumberMarker") | |
1247 | if not next_part_number: | |
1248 | next_part_number = part_list[-1]["PartNumber"] | |
1249 | uri_params['part-number-marker'] = next_part_number | |
1250 | else: | |
1251 | # Unexpectedly, the server lied, and so the previous | |
1252 | # response was not truncated. So, no new part to get. | |
1253 | yield False, part_list | |
1254 | break | |
1255 | debug("Listing continues after Part '%s'" % | |
1256 | uri_params['part-number-marker']) | |
1257 | else: | |
1258 | yield truncated, part_list | |
1259 | break | |
1260 | yield truncated, part_list | |
1261 | ||
1262 | def list_multipart_noparse(self, uri, upload_id, uri_params=None, | |
1263 | max_parts=-1): | |
1264 | if uri_params is None: | |
1265 | uri_params = {} | |
1266 | if max_parts != -1: | |
1267 | uri_params['max-parts'] = str(max_parts) | |
1268 | uri_params['uploadId'] = upload_id | |
1269 | request = self.create_request("OBJECT_GET", uri=uri, | |
1270 | uri_params=uri_params) | |
1066 | 1271 | response = self.send_request(request) |
1067 | 1272 | return response |
1068 | 1273 | |
1069 | 1274 | def abort_multipart(self, uri, id): |
1070 | 1275 | request = self.create_request("OBJECT_DELETE", uri = uri, |
1071 | uri_params = {'uploadId': id}) | |
1072 | response = self.send_request(request) | |
1073 | return response | |
1074 | ||
1075 | def list_multipart(self, uri, id): | |
1076 | request = self.create_request("OBJECT_GET", uri = uri, | |
1077 | 1276 | uri_params = {'uploadId': id}) |
1078 | 1277 | response = self.send_request(request) |
1079 | 1278 | return response |
1347 | 1546 | if response["status"] == 405: # Method Not Allowed. Don't retry. |
1348 | 1547 | raise S3Error(response) |
1349 | 1548 | |
1350 | if response["status"] >= 500: | |
1549 | if response["status"] >= 500 or response["status"] == 429: | |
1351 | 1550 | e = S3Error(response) |
1352 | 1551 | |
1353 | 1552 | if response["status"] == 501: |
1364 | 1563 | |
1365 | 1564 | if response["status"] < 200 or response["status"] > 299: |
1366 | 1565 | raise S3Error(response) |
1566 | ||
1567 | return response | |
1568 | ||
1569 | def send_request_with_progress(self, request, labels, operation_size=0): | |
1570 | """Wrapper around send_request for slow requests. | |
1571 | ||
1572 | To be able to show progression for small requests | |
1573 | """ | |
1574 | if not self.config.progress_meter: | |
1575 | info("Sending slow request, please wait...") | |
1576 | return self.send_request(request) | |
1577 | ||
1578 | if 'action' not in labels: | |
1579 | labels[u'action'] = u'request' | |
1580 | progress = self.config.progress_class(labels, operation_size) | |
1581 | ||
1582 | try: | |
1583 | response = self.send_request(request) | |
1584 | except Exception as exc: | |
1585 | progress.done("failed") | |
1586 | raise | |
1587 | ||
1588 | progress.update(current_position=operation_size) | |
1589 | progress.done("done") | |
1367 | 1590 | |
1368 | 1591 | return response |
1369 | 1592 | |
1444 | 1667 | http_response.read() |
1445 | 1668 | conn.c._HTTPConnection__state = ConnMan._CS_REQ_SENT |
1446 | 1669 | |
1447 | while (size_left > 0): | |
1670 | while size_left > 0: | |
1448 | 1671 | #debug("SendFile: Reading up to %d bytes from '%s' - remaining bytes: %s" % (self.config.send_chunk, filename, size_left)) |
1449 | 1672 | l = min(self.config.send_chunk, size_left) |
1450 | 1673 | if buffer == '': |
1451 | 1674 | data = stream.read(l) |
1452 | 1675 | else: |
1453 | 1676 | data = buffer |
1677 | ||
1678 | if not data: | |
1679 | raise InvalidFileError("File smaller than expected. Was the file truncated?") | |
1454 | 1680 | |
1455 | 1681 | if self.config.limitrate > 0: |
1456 | 1682 | start_time = time.time() |
1483 | 1709 | ConnMan.put(conn) |
1484 | 1710 | debug(u"Response:\n" + pprint.pformat(response)) |
1485 | 1711 | except ParameterError as e: |
1712 | raise | |
1713 | except InvalidFileError as e: | |
1714 | if self.config.progress_meter: | |
1715 | progress.done("failed") | |
1486 | 1716 | raise |
1487 | 1717 | except Exception as e: |
1488 | 1718 | if self.config.progress_meter: |
1506 | 1736 | response["data"] = http_response.read() |
1507 | 1737 | response["size"] = size_total |
1508 | 1738 | known_error = True |
1509 | except: | |
1739 | except Exception: | |
1510 | 1740 | error("Cannot retrieve any response status before encountering an EPIPE or ECONNRESET exception") |
1511 | 1741 | if not known_error: |
1512 | 1742 | warning("Upload failed: %s (%s)" % (resource['uri'], e)) |
1564 | 1794 | if response["status"] < 200 or response["status"] > 299: |
1565 | 1795 | try_retry = False |
1566 | 1796 | if response["status"] >= 500: |
1567 | ## AWS internal error - retry | |
1797 | # AWS internal error - retry | |
1568 | 1798 | try_retry = True |
1569 | 1799 | if response["status"] == 503: |
1570 | 1800 | ## SlowDown error |
1571 | 1801 | throttle = throttle and throttle * 5 or 0.01 |
1802 | elif response["status"] == 429: | |
1803 | # Not an AWS error, but a possible error from s3-compatible servers: | |
1804 | # TooManyRequests/Busy/slowdown | |
1805 | try_retry = True | |
1806 | throttle = throttle and throttle * 5 or 0.01 | |
1572 | 1807 | elif response["status"] >= 400: |
1573 | 1808 | err = S3Error(response) |
1574 | 1809 | ## Retriable client error? |
1585 | 1820 | return self.send_file(request, stream, labels, buffer, throttle, |
1586 | 1821 | retries - 1, offset, chunk_size, use_expect_continue) |
1587 | 1822 | else: |
1588 | warning("Too many failures. Giving up on '%s'" % (filename)) | |
1589 | raise S3UploadError | |
1823 | warning("Too many failures. Giving up on '%s'" % filename) | |
1824 | raise S3UploadError("Too many failures. Giving up on '%s'" | |
1825 | % filename) | |
1590 | 1826 | |
1591 | 1827 | ## Non-recoverable error |
1592 | 1828 | raise S3Error(response) |
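On 503, and now on 429 too, the retry path above multiplies the current throttle by 5 starting from 0.01 s, so repeated SlowDown/TooManyRequests responses back off geometrically. The schedule it produces:

    def next_throttle(throttle):
        # Same idiom as above: start at 0.01s, then grow 5x per slowdown.
        return throttle and throttle * 5 or 0.01

    t, delays = 0, []
    for _ in range(5):
        t = next_throttle(t)
        delays.append(t)
    print(delays)  # [0.01, 0.05, 0.25, 1.25, 6.25]
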
1602 | 1838 | retries - 1, offset, chunk_size, use_expect_continue) |
1603 | 1839 | else: |
1604 | 1840 | warning("Too many failures. Giving up on '%s'" % (filename)) |
1605 | raise S3UploadError | |
1606 | ||
1607 | return response | |
1608 | ||
1609 | def send_file_multipart(self, stream, headers, uri, size, extra_label = ""): | |
1841 | raise S3UploadError("Too many failures. Giving up on '%s'" | |
1842 | % filename) | |
1843 | ||
1844 | return response | |
1845 | ||
1846 | def send_file_multipart(self, stream, headers, uri, size, extra_label=""): | |
1610 | 1847 | timestamp_start = time.time() |
1611 | upload = MultiPartUpload(self, stream, uri, headers) | |
1848 | upload = MultiPartUpload(self, stream, uri, headers, size) | |
1612 | 1849 | upload.upload_all_parts(extra_label) |
1613 | 1850 | response = upload.complete_multipart_upload() |
1614 | 1851 | timestamp_end = time.time() |
1621 | 1858 | # raise S3UploadError |
1622 | 1859 | raise S3UploadError(getTextFromXml(response["data"], 'Message')) |
1623 | 1860 | return response |
1861 | ||
1862 | def copy_file_multipart(self, src_uri, dst_uri, size, headers, | |
1863 | extra_label=""): | |
1864 | return self.send_file_multipart(src_uri, headers, dst_uri, size, | |
1865 | extra_label) | |
1624 | 1866 | |
1625 | 1867 | def recv_file(self, request, stream, labels, start_position = 0, retries = _max_retries): |
1626 | 1868 | self.update_region_inner_request(request) |
1827 | 2069 | |
1828 | 2070 | def compute_content_md5(body): |
1829 | 2071 | m = md5(encode_to_s3(body)) |
1830 | base64md5 = base64.encodestring(m.digest()) | |
2072 | base64md5 = encodestring(m.digest()) | |
1831 | 2073 | base64md5 = decode_from_s3(base64md5) |
1832 | 2074 | if base64md5[-1] == '\n': |
1833 | 2075 | base64md5 = base64md5[0:-1] |
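compute_content_md5 pairs the md5 digest with the base64 helper imported near the top of the file (encodestring on Python 2, aliased to encodebytes on Python 3.9+). A self-contained equivalent using the Python 3 name directly:

    from base64 import encodebytes
    from hashlib import md5

    def content_md5(body):
        # Base64 of the raw md5 digest, trailing newline stripped --
        # the value expected in a Content-MD5 header.
        if not isinstance(body, bytes):
            body = body.encode('utf-8')
        b64 = encodebytes(md5(body).digest()).decode('ascii')
        return b64.rstrip('\n')

    print(content_md5(b''))  # 1B2M2Y8AsgTpgAmY7PhCfg==
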
36 | 36 | def keys(self): |
37 | 37 | # TODO fix |
38 | 38 | # Probably not anymore memory efficient on python2 |
39 | # as now 2 copies ok keys to sort them. | |
39 | # as now 2 copies of keys to sort them. | |
40 | 40 | keys = dict.keys(self) |
41 | 41 | if self.ignore_case: |
42 | 42 | # Translation map |
8 | 8 | from __future__ import absolute_import, division |
9 | 9 | |
10 | 10 | import os |
11 | import sys | |
12 | 11 | import time |
13 | 12 | import re |
14 | import string | |
13 | import string as string_mod | |
15 | 14 | import random |
16 | 15 | import errno |
17 | from calendar import timegm | |
18 | from logging import debug, warning, error | |
19 | from .ExitCodes import EX_OSFILE | |
20 | try: | |
21 | import dateutil.parser | |
22 | except ImportError: | |
23 | sys.stderr.write(u""" | |
24 | !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! | |
25 | ImportError trying to import dateutil.parser. | |
26 | Please install the python dateutil module: | |
27 | $ sudo apt-get install python-dateutil | |
28 | or | |
29 | $ sudo yum install python-dateutil | |
30 | or | |
31 | $ pip install python-dateutil | |
32 | !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! | |
33 | """) | |
34 | sys.stderr.flush() | |
35 | sys.exit(EX_OSFILE) | |
36 | ||
37 | try: | |
38 | from urllib import quote | |
39 | except ImportError: | |
40 | # python 3 support | |
41 | from urllib.parse import quote | |
16 | from hashlib import md5 | |
17 | from logging import debug | |
18 | ||
42 | 19 | |
43 | 20 | try: |
44 | 21 | unicode |
47 | 24 | # In python 3, unicode -> str, and str -> bytes |
48 | 25 | unicode = str |
49 | 26 | |
27 | ||
50 | 28 | import S3.Config |
51 | 29 | import S3.Exceptions |
52 | import xml.dom.minidom | |
53 | ||
54 | from hashlib import md5 | |
55 | ||
56 | import xml.etree.ElementTree as ET | |
30 | ||
31 | from S3.BaseUtils import (base_urlencode_string, base_replace_nonprintables, | |
32 | base_unicodise, base_deunicodise) | |
33 | ||
57 | 34 | |
58 | 35 | __all__ = [] |
59 | def parseNodes(nodes): | |
60 | ## WARNING: Ignores text nodes from mixed xml/text. | |
61 | ## For instance <tag1>some text<tag2>other text</tag2></tag1> | |
62 | ## will ignore the "some text" node | |
63 | retval = [] | |
64 | for node in nodes: | |
65 | retval_item = {} | |
66 | for child in node.getchildren(): | |
67 | name = decode_from_s3(child.tag) | |
68 | if child.getchildren(): | |
69 | retval_item[name] = parseNodes([child]) | |
70 | else: | |
71 | found_text = node.findtext(".//%s" % child.tag) | |
72 | if found_text is not None: | |
73 | retval_item[name] = decode_from_s3(found_text) | |
74 | else: | |
75 | retval_item[name] = None | |
76 | retval.append(retval_item) | |
77 | return retval | |
78 | __all__.append("parseNodes") | |
79 | ||
80 | def getPrettyFromXml(xmlstr): | |
81 | xmlparser = xml.dom.minidom.parseString(xmlstr) | |
82 | return xmlparser.toprettyxml() | |
83 | ||
84 | __all__.append("getPrettyFromXml") | |
85 | ||
86 | RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](http://[^\'"]+)[\'"]', re.MULTILINE) | |
87 | ||
88 | def stripNameSpace(xml): | |
89 | """ | |
90 | removeNameSpace(xml) -- remove top-level AWS namespace | |
91 | Operate on raw byte(utf-8) xml string. (Not unicode) | |
92 | """ | |
93 | xmlns_match = RE_XML_NAMESPACE.match(xml) | |
94 | if xmlns_match: | |
95 | xmlns = xmlns_match.group(3) | |
96 | xml = RE_XML_NAMESPACE.sub("\\1\\2", xml, 1) | |
97 | else: | |
98 | xmlns = None | |
99 | return xml, xmlns | |
100 | __all__.append("stripNameSpace") | |
101 | ||
102 | def getTreeFromXml(xml): | |
103 | xml, xmlns = stripNameSpace(encode_to_s3(xml)) | |
104 | try: | |
105 | tree = ET.fromstring(xml) | |
106 | if xmlns: | |
107 | tree.attrib['xmlns'] = xmlns | |
108 | return tree | |
109 | except Exception as e: | |
110 | error("Error parsing xml: %s", e) | |
111 | error(xml) | |
112 | raise | |
113 | ||
114 | __all__.append("getTreeFromXml") | |
115 | ||
116 | def getListFromXml(xml, node): | |
117 | tree = getTreeFromXml(xml) | |
118 | nodes = tree.findall('.//%s' % (node)) | |
119 | return parseNodes(nodes) | |
120 | __all__.append("getListFromXml") | |
121 | ||
122 | def getDictFromTree(tree): | |
123 | ret_dict = {} | |
124 | for child in tree.getchildren(): | |
125 | if child.getchildren(): | |
126 | ## Complex-type child. Recurse | |
127 | content = getDictFromTree(child) | |
128 | else: | |
129 | content = decode_from_s3(child.text) if child.text is not None else None | |
130 | child_tag = decode_from_s3(child.tag) | |
131 | if child_tag in ret_dict: | |
132 | if not type(ret_dict[child_tag]) == list: | |
133 | ret_dict[child_tag] = [ret_dict[child_tag]] | |
134 | ret_dict[child_tag].append(content or "") | |
135 | else: | |
136 | ret_dict[child_tag] = content or "" | |
137 | return ret_dict | |
138 | __all__.append("getDictFromTree") | |
139 | ||
140 | def getTextFromXml(xml, xpath): | |
141 | tree = getTreeFromXml(xml) | |
142 | if tree.tag.endswith(xpath): | |
143 | return decode_from_s3(tree.text) if tree.text is not None else None | |
144 | else: | |
145 | result = tree.findtext(xpath) | |
146 | return decode_from_s3(result) if result is not None else None | |
147 | __all__.append("getTextFromXml") | |
148 | ||
149 | def getRootTagName(xml): | |
150 | tree = getTreeFromXml(xml) | |
151 | return decode_from_s3(tree.tag) if tree.tag is not None else None | |
152 | __all__.append("getRootTagName") | |
153 | ||
154 | def xmlTextNode(tag_name, text): | |
155 | el = ET.Element(tag_name) | |
156 | el.text = decode_from_s3(text) | |
157 | return el | |
158 | __all__.append("xmlTextNode") | |
159 | ||
160 | def appendXmlTextNode(tag_name, text, parent): | |
161 | """ | |
162 | Creates a new <tag_name> Node and sets | |
163 | its content to 'text'. Then appends the | |
164 | created Node to 'parent' element if given. | |
165 | Returns the newly created Node. | |
166 | """ | |
167 | el = xmlTextNode(tag_name, text) | |
168 | parent.append(el) | |
169 | return el | |
170 | __all__.append("appendXmlTextNode") | |
171 | ||
172 | RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)') | |
173 | ||
174 | def dateS3toPython(date): | |
175 | # Reset milliseconds to 000 | |
176 | date = RE_S3_DATESTRING.sub(".000", date) | |
177 | return dateutil.parser.parse(date, fuzzy=True) | |
178 | __all__.append("dateS3toPython") | |
179 | ||
180 | def dateS3toUnix(date): | |
181 | ## NOTE: This is timezone-aware and return the timestamp regarding GMT | |
182 | return timegm(dateS3toPython(date).utctimetuple()) | |
183 | __all__.append("dateS3toUnix") | |
184 | ||
185 | def dateRFC822toPython(date): | |
186 | return dateutil.parser.parse(date, fuzzy=True) | |
187 | __all__.append("dateRFC822toPython") | |
188 | ||
189 | def dateRFC822toUnix(date): | |
190 | return timegm(dateRFC822toPython(date).utctimetuple()) | |
191 | __all__.append("dateRFC822toUnix") | |
192 | ||
193 | def formatSize(size, human_readable = False, floating_point = False): | |
36 | ||
37 | ||
38 | def formatSize(size, human_readable=False, floating_point=False): | |
194 | 39 | size = floating_point and float(size) or int(size) |
195 | 40 | if human_readable: |
196 | 41 | coeffs = ['K', 'M', 'G', 'T'] |
203 | 48 | return (size, "") |
204 | 49 | __all__.append("formatSize") |
205 | 50 | |
206 | def formatDateTime(s3timestamp): | |
207 | date_obj = dateutil.parser.parse(s3timestamp, fuzzy=True) | |
208 | return date_obj.strftime("%Y-%m-%d %H:%M") | |
209 | __all__.append("formatDateTime") | |
210 | 51 | |
211 | 52 | def convertHeaderTupleListToDict(list): |
212 | 53 | """ |
218 | 59 | return retval |
219 | 60 | __all__.append("convertHeaderTupleListToDict") |
220 | 61 | |
221 | _rnd_chars = string.ascii_letters+string.digits | |
62 | _rnd_chars = string_mod.ascii_letters + string_mod.digits | |
222 | 63 | _rnd_chars_len = len(_rnd_chars) |
223 | 64 | def rndstr(len): |
224 | 65 | retval = "" |
227 | 68 | len -= 1 |
228 | 69 | return retval |
229 | 70 | __all__.append("rndstr") |
71 | ||
230 | 72 | |
231 | 73 | def mktmpsomething(prefix, randchars, createfunc): |
232 | 74 | old_umask = os.umask(0o077) |
246 | 88 | return dirname |
247 | 89 | __all__.append("mktmpsomething") |
248 | 90 | |
91 | ||
249 | 92 | def mktmpdir(prefix = os.getenv('TMP','/tmp') + "/tmpdir-", randchars = 10): |
250 | 93 | return mktmpsomething(prefix, randchars, os.mkdir) |
251 | 94 | __all__.append("mktmpdir") |
95 | ||
252 | 96 | |
253 | 97 | def mktmpfile(prefix = os.getenv('TMP','/tmp') + "/tmpfile-", randchars = 20): |
254 | 98 | createfunc = lambda filename : os.close(os.open(deunicodise(filename), os.O_CREAT | os.O_EXCL)) |
255 | 99 | return mktmpsomething(prefix, randchars, createfunc) |
256 | 100 | __all__.append("mktmpfile") |
101 | ||
257 | 102 | |
258 | 103 | def hash_file_md5(filename): |
259 | 104 | h = md5() |
266 | 111 | h.update(data) |
267 | 112 | return h.hexdigest() |
268 | 113 | __all__.append("hash_file_md5") |
114 | ||
269 | 115 | |
270 | 116 | def mkdir_with_parents(dir_name): |
271 | 117 | """ |
294 | 140 | return True |
295 | 141 | __all__.append("mkdir_with_parents") |
296 | 142 | |
297 | def unicodise(string, encoding = None, errors = "replace", silent=False): | |
298 | """ | |
299 | Convert 'string' to Unicode or raise an exception. | |
300 | """ | |
301 | ||
143 | def unicodise(string, encoding=None, errors='replace', silent=False): | |
302 | 144 | if not encoding: |
303 | 145 | encoding = S3.Config.Config().encoding |
304 | ||
305 | if type(string) == unicode: | |
306 | return string | |
307 | ||
308 | if not silent: | |
309 | debug("Unicodising %r using %s" % (string, encoding)) | |
310 | try: | |
311 | return unicode(string, encoding, errors) | |
312 | except UnicodeDecodeError: | |
313 | raise UnicodeDecodeError("Conversion to unicode failed: %r" % string) | |
146 | return base_unicodise(string, encoding, errors, silent) | |
314 | 147 | __all__.append("unicodise") |
315 | 148 | |
316 | def unicodise_s(string, encoding = None, errors = "replace"): | |
149 | ||
150 | def unicodise_s(string, encoding=None, errors='replace'): | |
317 | 151 | """ |
318 | 152 | Alias to silent version of unicodise |
319 | 153 | """ |
320 | 154 | return unicodise(string, encoding, errors, True) |
321 | 155 | __all__.append("unicodise_s") |
322 | 156 | |
323 | def deunicodise(string, encoding = None, errors = "replace", silent=False): | |
324 | """ | |
325 | Convert unicode 'string' to <type str>, by default replacing | |
326 | all invalid characters with '?' or raise an exception. | |
327 | """ | |
328 | ||
157 | ||
158 | def deunicodise(string, encoding=None, errors='replace', silent=False): | |
329 | 159 | if not encoding: |
330 | 160 | encoding = S3.Config.Config().encoding |
331 | ||
332 | if type(string) != unicode: | |
333 | return string | |
334 | ||
335 | if not silent: | |
336 | debug("DeUnicodising %r using %s" % (string, encoding)) | |
337 | try: | |
338 | return string.encode(encoding, errors) | |
339 | except UnicodeEncodeError: | |
340 | raise UnicodeEncodeError("Conversion from unicode failed: %r" % string) | |
161 | return base_deunicodise(string, encoding, errors, silent) | |
341 | 162 | __all__.append("deunicodise") |
342 | 163 | |
343 | def deunicodise_s(string, encoding = None, errors = "replace"): | |
164 | ||
165 | def deunicodise_s(string, encoding=None, errors='replace'): | |
344 | 166 | """ |
345 | 167 | Alias to silent version of deunicodise |
346 | 168 | """ |
347 | 169 | return deunicodise(string, encoding, errors, True) |
348 | 170 | __all__.append("deunicodise_s") |
349 | 171 | |
350 | def unicodise_safe(string, encoding = None): | |
172 | ||
173 | def unicodise_safe(string, encoding=None): | |
351 | 174 | """ |
352 | 175 | Convert 'string' to Unicode according to current encoding |
353 | 176 | and replace all invalid characters with '?' |
356 | 179 | return unicodise(deunicodise(string, encoding), encoding).replace(u'\ufffd', '?') |
357 | 180 | __all__.append("unicodise_safe") |
358 | 181 | |
359 | def decode_from_s3(string, errors = "replace"): | |
360 | """ | |
361 | Convert S3 UTF-8 'string' to Unicode or raise an exception. | |
362 | """ | |
363 | if type(string) == unicode: | |
364 | return string | |
365 | # Be quiet by default | |
366 | #debug("Decoding string from S3: %r" % string) | |
367 | try: | |
368 | return unicode(string, "UTF-8", errors) | |
369 | except UnicodeDecodeError: | |
370 | raise UnicodeDecodeError("Conversion to unicode failed: %r" % string) | |
371 | __all__.append("decode_from_s3") | |
372 | ||
373 | def encode_to_s3(string, errors = "replace"): | |
374 | """ | |
375 | Convert Unicode to S3 UTF-8 'string', by default replacing | |
376 | all invalid characters with '?' or raise an exception. | |
377 | """ | |
378 | if type(string) != unicode: | |
379 | return string | |
380 | # Be quiet by default | |
381 | #debug("Encoding string to S3: %r" % string) | |
382 | try: | |
383 | return string.encode("UTF-8", errors) | |
384 | except UnicodeEncodeError: | |
385 | raise UnicodeEncodeError("Conversion from unicode failed: %r" % string) | |
386 | __all__.append("encode_to_s3") | |
387 | 182 | |
388 | 183 | ## Low level methods |
389 | def urlencode_string(string, urlencoding_mode = None, unicode_output=False): | |
390 | string = encode_to_s3(string) | |
391 | ||
184 | def urlencode_string(string, urlencoding_mode=None, unicode_output=False): | |
392 | 185 | if urlencoding_mode is None: |
393 | 186 | urlencoding_mode = S3.Config.Config().urlencoding_mode |
394 | ||
395 | if urlencoding_mode == "verbatim": | |
396 | ## Don't do any pre-processing | |
397 | return string | |
398 | ||
399 | encoded = quote(string, safe="~/") | |
400 | debug("String '%s' encoded to '%s'" % (string, encoded)) | |
401 | if unicode_output: | |
402 | return decode_from_s3(encoded) | |
403 | else: | |
404 | return encode_to_s3(encoded) | |
187 | return base_urlencode_string(string, urlencoding_mode, unicode_output) | |
405 | 188 | __all__.append("urlencode_string") |
189 | ||
406 | 190 | |
407 | 191 | def replace_nonprintables(string): |
408 | 192 | """ |
411 | 195 | Replaces all non-printable characters 'ch' in 'string' |
412 | 196 | where ord(ch) <= 31 with ^@, ^A, ... ^_, and chr(127) with ^?
413 | 197 | """ |
414 | new_string = "" | |
415 | modified = 0 | |
416 | for c in string: | |
417 | o = ord(c) | |
418 | if (o <= 31): | |
419 | new_string += "^" + chr(ord('@') + o) | |
420 | modified += 1 | |
421 | elif (o == 127): | |
422 | new_string += "^?" | |
423 | modified += 1 | |
424 | else: | |
425 | new_string += c | |
426 | if modified and S3.Config.Config().urlencoding_mode != "fixbucket": | |
427 | warning("%d non-printable characters replaced in: %s" % (modified, new_string)) | |
428 | return new_string | |
198 | warning_message = (S3.Config.Config().urlencoding_mode != "fixbucket") | |
199 | return base_replace_nonprintables(string, warning_message) | |
429 | 200 | __all__.append("replace_nonprintables") |
201 | ||
430 | 202 | |
431 | 203 | def time_to_epoch(t): |
432 | 204 | """Convert time specified in a variety of forms into UNIX epoch time. |
462 | 234 | raise S3.Exceptions.ParameterError('Unable to convert %r to an epoch time. Pass an epoch time. Try `date -d \'now + 1 year\' +%%s` (shell) or time.mktime (Python).' % t) |
463 | 235 | |
464 | 236 | |
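The accepted input forms of time_to_epoch are mostly elided here, but integers clearly pass straight through, and the docstring promises parsing for other representations; a hedged sketch:

    time_to_epoch(1632700800)      # epoch seconds in -> epoch seconds out
    time_to_epoch(u"1632700800")   # numeric strings too (an assumption from the docstring)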
465 | def check_bucket_name(bucket, dns_strict = True): | |
237 | def check_bucket_name(bucket, dns_strict=True): | |
466 | 238 | if dns_strict: |
467 | 239 | invalid = re.search(r"([^a-z0-9\.-])", bucket, re.UNICODE)
468 | 240 | if invalid: |
497 | 269 | return False |
498 | 270 | __all__.append("check_bucket_name_dns_conformity") |
499 | 271 | |
272 | ||
500 | 273 | def check_bucket_name_dns_support(bucket_host, bucket_name): |
501 | 274 | """ |
502 | 275 | Check whether the host_bucket setting supports buckets and
507 | 280 | |
508 | 281 | return check_bucket_name_dns_conformity(bucket_name) |
509 | 282 | __all__.append("check_bucket_name_dns_support") |
283 | ||
510 | 284 | |
511 | 285 | def getBucketFromHostname(hostname): |
512 | 286 | """ |
528 | 302 | return m.group(1), True |
529 | 303 | __all__.append("getBucketFromHostname") |
530 | 304 | |
305 | ||
531 | 306 | def getHostnameFromBucket(bucket): |
532 | 307 | return S3.Config.Config().host_bucket.lower() % { 'bucket' : bucket } |
533 | 308 | __all__.append("getHostnameFromBucket") |
540 | 315 | mfile.seek(offset) |
541 | 316 | while size_left > 0: |
542 | 317 | data = mfile.read(min(send_chunk, size_left)) |
318 | if not data: | |
319 | break | |
543 | 320 | md5_hash.update(data) |
544 | 321 | size_left -= len(data) |
545 | 322 | else: |
575 | 352 | |
576 | 353 | |
577 | 354 | # vim:et:ts=4:sts=4:ai |
578 |
24 | 24 | |
25 | 25 | if sys.version_info < (2, 6): |
26 | 26 | sys.stderr.write(u"ERROR: Python 2.6 or higher required, sorry.\n") |
27 | sys.exit(EX_OSFILE) | |
27 | # 72 == EX_OSFILE | |
28 | sys.exit(72) | |
28 | 29 | |
29 | 30 | PY3 = (sys.version_info >= (3, 0)) |
30 | 31 | |
50 | 51 | |
51 | 52 | try: |
52 | 53 | import htmlentitydefs |
53 | except: | |
54 | except Exception: | |
54 | 55 | # python 3 support |
55 | 56 | import html.entities as htmlentitydefs |
56 | 57 | |
60 | 61 | # python 3 support |
61 | 62 | # In python 3, unicode -> str, and str -> bytes |
62 | 63 | unicode = str |
64 | ||
65 | try: | |
66 | unichr | |
67 | except NameError: | |
68 | # python 3 support | |
69 | # In python 3, unichr was removed as chr can now do the job | |
70 | unichr = chr | |
63 | 71 | |
64 | 72 | try: |
65 | 73 | from shutil import which |
242 | 250 | def cmd_bucket_create(args): |
243 | 251 | cfg = Config() |
244 | 252 | s3 = S3(cfg) |
253 | ||
245 | 254 | for arg in args: |
246 | 255 | uri = S3Uri(arg) |
247 | 256 | if not uri.type == "s3" or not uri.has_bucket() or uri.has_object(): |
248 | 257 | raise ParameterError("Expecting S3 URI with just the bucket name set instead of '%s'" % arg) |
249 | 258 | try: |
250 | response = s3.bucket_create(uri.bucket(), cfg.bucket_location) | |
259 | response = s3.bucket_create(uri.bucket(), cfg.bucket_location, cfg.extra_headers) | |
251 | 260 | output(u"Bucket '%s' created" % uri.uri()) |
252 | 261 | except S3Error as e: |
253 | 262 | if e.info["Code"] in S3.codes: |
420 | 429 | seq += 1 |
421 | 430 | |
422 | 431 | uri_final = S3Uri(local_list[key]['remote_uri']) |
432 | try: | |
433 | src_md5 = local_list.get_md5(key) | |
434 | except IOError: | |
435 | src_md5 = None | |
423 | 436 | |
424 | 437 | extra_headers = copy(cfg.extra_headers) |
425 | 438 | full_name_orig = local_list[key]['full_name'] |
427 | 440 | seq_label = "[%d of %d]" % (seq, local_count) |
428 | 441 | if Config().encrypt: |
429 | 442 | gpg_exitcode, full_name, extra_headers["x-amz-meta-s3tools-gpgenc"] = gpg_encrypt(full_name_orig) |
430 | attr_header = _build_attr_header(local_list, key) | |
443 | attr_header = _build_attr_header(local_list[key], key, src_md5) | |
431 | 444 | debug(u"attr_header: %s" % attr_header) |
432 | 445 | extra_headers.update(attr_header) |
433 | 446 | try: |
529 | 542 | if remote_count > 1: |
530 | 543 | raise ParameterError("Destination must be a directory or stdout when downloading multiple sources.") |
531 | 544 | remote_list[remote_list.keys()[0]]['local_filename'] = destination_base |
532 | elif os.path.isdir(deunicodise(destination_base)): | |
545 | else: | |
533 | 546 | if destination_base[-1] != os.path.sep: |
534 | 547 | destination_base += os.path.sep |
535 | 548 | for key in remote_list: |
537 | 550 | if os.path.sep != "/": |
538 | 551 | local_filename = os.path.sep.join(local_filename.split("/")) |
539 | 552 | remote_list[key]['local_filename'] = local_filename |
540 | else: | |
541 | raise InternalError("WTF? Is it a dir or not? -- %s" % destination_base) | |
542 | 553 | |
543 | 554 | if cfg.dry_run: |
544 | 555 | for key in exclude_list: |
596 | 607 | dst_stream.close() |
597 | 608 | raise ParameterError(u"File %s already exists. Use either of --force / --continue / --skip-existing or give it a new name." % destination) |
598 | 609 | except IOError as e: |
599 | error(u"Skipping %s: %s" % (destination, e.strerror)) | |
610 | error(u"Creation of file '%s' failed (Reason: %s)" | |
611 | % (destination, e.strerror)) | |
612 | if cfg.stop_on_error: | |
613 | error(u"Exiting now because of --stop-on-error") | |
614 | raise | |
615 | ret = EX_PARTIAL | |
600 | 616 | continue |
617 | ||
601 | 618 | try: |
602 | 619 | try: |
603 | 620 | response = s3.object_get(uri, dst_stream, destination, start_position = start_position, extra_label = seq_label) |
604 | 621 | finally: |
605 | 622 | dst_stream.close() |
606 | 623 | except S3DownloadError as e: |
607 | error(u"%s: Skipping that file. This is usually a transient error, please try again later." % e) | |
608 | if not file_exists: # Delete, only if file didn't exist before! | |
624 | error(u"Download of '%s' failed (Reason: %s)" % (destination, e)) | |
625 | # Delete, only if file didn't exist before! | |
626 | if not file_exists: | |
609 | 627 | debug(u"object_get failed for '%s', deleting..." % (destination,)) |
610 | 628 | os.unlink(deunicodise(destination)) |
611 | ret = EX_PARTIAL | |
612 | if cfg.stop_on_error: | |
613 | ret = EX_DATAERR | |
614 | break | |
629 | if cfg.stop_on_error: | |
630 | error(u"Exiting now because of --stop-on-error") | |
631 | raise | |
632 | ret = EX_PARTIAL | |
615 | 633 | continue |
616 | 634 | except S3Error as e: |
635 | error(u"Download of '%s' failed (Reason: %s)" % (destination, e)) | |
617 | 636 | if not file_exists: # Delete, only if file didn't exist before! |
618 | 637 | debug(u"object_get failed for '%s', deleting..." % (destination,)) |
619 | 638 | os.unlink(deunicodise(destination)) |
633 | 652 | if Config().delete_after_fetch: |
634 | 653 | s3.object_delete(uri) |
635 | 654 | output(u"File '%s' removed after fetch" % (uri)) |
636 | return EX_OK | |
655 | return ret | |
637 | 656 | |
638 | 657 | def cmd_object_del(args): |
639 | 658 | cfg = Config() |
830 | 849 | cfg = Config() |
831 | 850 | if action_str == 'modify': |
832 | 851 | if len(args) < 1: |
833 | raise ParameterError("Expecting one or more S3 URIs for " + action_str) | |
852 | raise ParameterError("Expecting one or more S3 URIs for " | |
853 | + action_str) | |
834 | 854 | destination_base = None |
835 | 855 | else: |
836 | 856 | if len(args) < 2: |
837 | raise ParameterError("Expecting two or more S3 URIs for " + action_str) | |
857 | raise ParameterError("Expecting two or more S3 URIs for " | |
858 | + action_str) | |
838 | 859 | dst_base_uri = S3Uri(args.pop()) |
839 | 860 | if dst_base_uri.type != "s3": |
840 | raise ParameterError("Destination must be S3 URI. To download a file use 'get' or 'sync'.") | |
861 | raise ParameterError("Destination must be S3 URI. To download a " | |
862 | "file use 'get' or 'sync'.") | |
841 | 863 | destination_base = dst_base_uri.uri() |
842 | 864 | |
843 | 865 | scoreboard = ExitScoreboard() |
844 | 866 | |
845 | remote_list, exclude_list, remote_total_size = fetch_remote_list(args, require_attribs = False) | |
867 | remote_list, exclude_list, remote_total_size = \ | |
868 | fetch_remote_list(args, require_attribs=False) | |
846 | 869 | |
847 | 870 | remote_count = len(remote_list) |
848 | 871 | |
853 | 876 | # so we don't need to test for it here. |
854 | 877 | if not destination_base.endswith('/') \ |
855 | 878 | and (len(remote_list) > 1 or cfg.recursive): |
856 | raise ParameterError("Destination must be a directory and end with '/' when acting on a folder content or on multiple sources.") | |
879 | raise ParameterError("Destination must be a directory and end with" | |
880 | " '/' when acting on a folder content or on " | |
881 | "multiple sources.") | |
857 | 882 | |
858 | 883 | if cfg.recursive: |
859 | 884 | for key in remote_list: |
872 | 897 | for key in exclude_list: |
873 | 898 | output(u"exclude: %s" % key) |
874 | 899 | for key in remote_list: |
875 | output(u"%s: '%s' -> '%s'" % (action_str, remote_list[key]['object_uri_str'], remote_list[key]['dest_name'])) | |
900 | output(u"%s: '%s' -> '%s'" % (action_str, | |
901 | remote_list[key]['object_uri_str'], | |
902 | remote_list[key]['dest_name'])) | |
876 | 903 | |
877 | 904 | warning(u"Exiting now because of --dry-run") |
878 | 905 | return EX_OK |
885 | 912 | item = remote_list[key] |
886 | 913 | src_uri = S3Uri(item['object_uri_str']) |
887 | 914 | dst_uri = S3Uri(item['dest_name']) |
915 | src_size = item.get('size') | |
888 | 916 | |
889 | 917 | extra_headers = copy(cfg.extra_headers) |
890 | 918 | try: |
891 | response = process_fce(src_uri, dst_uri, extra_headers) | |
892 | output(message % { "src" : src_uri, "dst" : dst_uri }) | |
919 | response = process_fce(src_uri, dst_uri, extra_headers, | |
920 | src_size=src_size, | |
921 | extra_label=seq_label) | |
922 | output(message % {"src": src_uri, "dst": dst_uri, | |
923 | "extra": seq_label}) | |
893 | 924 | if Config().acl_public: |
894 | 925 | info(u"Public URL is: %s" % dst_uri.public_url()) |
895 | 926 | scoreboard.success() |
896 | except S3Error as e: | |
897 | if e.code == "NoSuchKey": | |
927 | except (S3Error, S3UploadError) as exc: | |
928 | if isinstance(exc, S3Error) and exc.code == "NoSuchKey": | |
898 | 929 | scoreboard.notfound() |
899 | 930 | warning(u"Key not found %s" % item['object_uri_str']) |
900 | 931 | else: |
901 | 932 | scoreboard.failed() |
902 | if cfg.stop_on_error: break | |
933 | error(u"Copy failed for: '%s' (%s)", item['object_uri_str'], | |
934 | exc) | |
935 | ||
936 | if cfg.stop_on_error: | |
937 | break | |
903 | 938 | return scoreboard.rc() |
904 | 939 | |
905 | 940 | def cmd_cp(args): |
906 | 941 | s3 = S3(Config()) |
907 | return subcmd_cp_mv(args, s3.object_copy, "copy", u"remote copy: '%(src)s' -> '%(dst)s'") | |
942 | return subcmd_cp_mv(args, s3.object_copy, "copy", | |
943 | u"remote copy: '%(src)s' -> '%(dst)s' %(extra)s") | |
908 | 944 | |
909 | 945 | def cmd_modify(args): |
910 | 946 | s3 = S3(Config()) |
911 | return subcmd_cp_mv(args, s3.object_modify, "modify", u"modify: '%(src)s'") | |
947 | return subcmd_cp_mv(args, s3.object_modify, "modify", | |
948 | u"modify: '%(src)s' %(extra)s") | |
912 | 949 | |
913 | 950 | def cmd_mv(args): |
914 | 951 | s3 = S3(Config()) |
915 | return subcmd_cp_mv(args, s3.object_move, "move", u"move: '%(src)s' -> '%(dst)s'") | |
952 | return subcmd_cp_mv(args, s3.object_move, "move", | |
953 | u"move: '%(src)s' -> '%(dst)s' %(extra)s") | |
916 | 954 | |
917 | 955 | def cmd_info(args): |
918 | 956 | cfg = Config() |
946 | 984 | else: |
947 | 985 | info = s3.bucket_info(uri) |
948 | 986 | output(u"%s (bucket):" % uri.uri()) |
949 | output(u" Location: %s" % info['bucket-location']) | |
950 | output(u" Payer: %s" % info['requester-pays']) | |
987 | output(u" Location: %s" % (info['bucket-location'] | |
988 | or 'none')) | |
989 | output(u" Payer: %s" % (info['requester-pays'] | |
990 | or 'none')) | |
951 | 991 | expiration = s3.expiration_info(uri, cfg.bucket_location) |
952 | if expiration: | |
992 | if expiration and expiration['prefix'] is not None: | |
953 | 993 | expiration_desc = "Expiration Rule: " |
954 | 994 | if expiration['prefix'] == "": |
955 | 995 | expiration_desc += "all objects in this bucket " |
956 | else: | |
996 | elif expiration['prefix'] is not None: | |
957 | 997 | expiration_desc += "objects with key prefix '" + expiration['prefix'] + "' " |
958 | 998 | expiration_desc += "will expire in '" |
959 | 999 | if expiration['days']: |
1108 | 1148 | item = src_list[file] |
1109 | 1149 | src_uri = S3Uri(item['object_uri_str']) |
1110 | 1150 | dst_uri = S3Uri(item['target_uri']) |
1151 | src_size = item.get('size') | |
1111 | 1152 | seq_label = "[%d of %d]" % (seq, src_count) |
1112 | 1153 | extra_headers = copy(cfg.extra_headers) |
1113 | 1154 | try: |
1114 | response = s3.object_copy(src_uri, dst_uri, extra_headers) | |
1115 | output(u"remote copy: '%s' -> '%s'" % (src_uri, dst_uri)) | |
1155 | response = s3.object_copy(src_uri, dst_uri, extra_headers, | |
1156 | src_size=src_size, | |
1157 | extra_label=seq_label) | |
1158 | output(u"remote copy: '%s' -> '%s' %s" % | |
1159 | (src_uri, dst_uri, seq_label)) | |
1116 | 1160 | total_nb_files += 1 |
1117 | 1161 | total_size += item.get(u'size', 0) |
1118 | except S3Error as exc: | |
1162 | except (S3Error, S3UploadError) as exc: | |
1119 | 1163 | ret = EX_PARTIAL |
1120 | 1164 | error(u"File '%s' could not be copied: %s", src_uri, exc) |
1121 | 1165 | if cfg.stop_on_error: |
1135 | 1179 | total_files_copied += nb_files |
1136 | 1180 | total_size_copied += size |
1137 | 1181 | |
1138 | ||
1139 | n_copied, bytes_saved, failed_copy_files = remote_copy(s3, copy_pairs, destination_base, None) | |
1182 | n_copied, bytes_saved, failed_copy_files = remote_copy( | |
1183 | s3, copy_pairs, destination_base, None, False) | |
1140 | 1184 | total_files_copied += n_copied |
1141 | 1185 | total_size_copied += bytes_saved |
1142 | 1186 | |
1143 | 1187 | #process files not copied |
1144 | debug("Process files that were not remote copied") | |
1145 | failed_copy_count = len (failed_copy_files) | |
1188 | debug("Process files that were not remotely copied") | |
1189 | failed_copy_count = len(failed_copy_files) | |
1146 | 1190 | for key in failed_copy_files: |
1147 | 1191 | failed_copy_files[key]['target_uri'] = destination_base + key |
1148 | 1192 | status, seq, nb_files, size = _upload(failed_copy_files, seq, src_count + update_count + failed_copy_count) |
1294 | 1338 | deleted_count, deleted_size = (0, 0) |
1295 | 1339 | |
1296 | 1340 | def _download(remote_list, seq, total, total_size, dir_cache): |
1297 | original_umask = os.umask(0); | |
1298 | os.umask(original_umask); | |
1341 | original_umask = os.umask(0) | |
1342 | os.umask(original_umask) | |
1299 | 1343 | file_list = remote_list.keys() |
1300 | 1344 | file_list.sort() |
1301 | 1345 | ret = EX_OK |
1368 | 1412 | # Try to remove the temp file if it exists |
1369 | 1413 | if chkptfname_b: |
1370 | 1414 | os.unlink(chkptfname_b) |
1371 | except: | |
1415 | except Exception: | |
1372 | 1416 | pass |
1373 | 1417 | |
1374 | 1418 | if allow_partial and not cfg.stop_on_error: |
1411 | 1455 | try: |
1412 | 1456 | # set permissions on destination file |
1413 | 1457 | if not is_empty_directory: # a normal file |
1414 | mode = 0o777 - original_umask; | |
1415 | else: # an empty directory, make them readable/executable | |
1458 | mode = 0o777 - original_umask | |
1459 | else: | |
1460 | # an empty directory, make them readable/executable | |
1416 | 1461 | mode = 0o775 |
1417 | 1462 | debug(u"mode=%s" % oct(mode)) |
1418 | os.chmod(deunicodise(dst_file), mode); | |
1463 | os.chmod(deunicodise(dst_file), mode) | |
1419 | 1464 | except: |
1420 | 1465 | raise |
1421 | 1466 | |
1465 | 1510 | finally: |
1466 | 1511 | try: |
1467 | 1512 | os.remove(chkptfname_b) |
1468 | except: | |
1513 | except Exception: | |
1469 | 1514 | pass |
1470 | 1515 | |
1471 | 1516 | speed_fmt = formatSize(response["speed"], human_readable = True, floating_point = True) |
1528 | 1573 | # For instance all empty files would become hardlinked together! |
1529 | 1574 | saved_bytes = 0 |
1530 | 1575 | failed_copy_list = FileDict() |
1531 | for (src_obj, dst1, relative_file) in copy_pairs: | |
1576 | for (src_obj, dst1, relative_file, md5) in copy_pairs: | |
1532 | 1577 | src_file = os.path.join(destination_base, dst1) |
1533 | 1578 | dst_file = os.path.join(destination_base, relative_file) |
1534 | 1579 | dst_dir = os.path.dirname(deunicodise(dst_file)) |
1544 | 1589 | failed_copy_list[relative_file] = src_obj |
1545 | 1590 | return len(copy_pairs), saved_bytes, failed_copy_list |
1546 | 1591 | |
1547 | def remote_copy(s3, copy_pairs, destination_base, uploaded_objects_list=None): | |
1592 | def remote_copy(s3, copy_pairs, destination_base, uploaded_objects_list=None, | |
1593 | metadata_update=False): | |
1548 | 1594 | cfg = Config() |
1549 | 1595 | saved_bytes = 0 |
1550 | 1596 | failed_copy_list = FileDict() |
1551 | for (src_obj, dst1, dst2) in copy_pairs: | |
1597 | seq = 0 | |
1598 | src_count = len(copy_pairs) | |
1599 | for (src_obj, dst1, dst2, src_md5) in copy_pairs: | |
1600 | seq += 1 | |
1552 | 1601 | debug(u"Remote Copying from %s to %s" % (dst1, dst2)) |
1553 | 1602 | dst1_uri = S3Uri(destination_base + dst1) |
1554 | 1603 | dst2_uri = S3Uri(destination_base + dst2) |
1604 | src_obj_size = src_obj.get(u'size', 0) | |
1605 | seq_label = "[%d of %d]" % (seq, src_count) | |
1555 | 1606 | extra_headers = copy(cfg.extra_headers) |
1607 | if metadata_update: | |
1608 | # source is a real local file with its own personal metadata | |
1609 | attr_header = _build_attr_header(src_obj, dst2, src_md5) | |
1610 | debug(u"attr_header: %s" % attr_header) | |
1611 | extra_headers.update(attr_header) | |
1612 | extra_headers['content-type'] = \ | |
1613 | s3.content_type(filename=src_obj['full_name']) | |
1556 | 1614 | try: |
1557 | s3.object_copy(dst1_uri, dst2_uri, extra_headers) | |
1558 | saved_bytes += src_obj.get(u'size', 0) | |
1559 | output(u"remote copy: '%s' -> '%s'" % (dst1, dst2)) | |
1615 | s3.object_copy(dst1_uri, dst2_uri, extra_headers, | |
1616 | src_size=src_obj_size, | |
1617 | extra_label=seq_label) | |
1618 | output(u"remote copy: '%s' -> '%s' %s" % (dst1, dst2, seq_label)) | |
1619 | saved_bytes += src_obj_size | |
1560 | 1620 | if uploaded_objects_list is not None: |
1561 | 1621 | uploaded_objects_list.append(dst2) |
1562 | except: | |
1622 | except Exception: | |
1563 | 1623 | warning(u"Unable to remote copy files '%s' -> '%s'" % (dst1_uri, dst2_uri)) |
1564 | 1624 | failed_copy_list[dst2] = src_obj |
1565 | 1625 | return (len(copy_pairs), saved_bytes, failed_copy_list) |
1566 | 1626 | |
1567 | def _build_attr_header(local_list, src): | |
1627 | def _build_attr_header(src_obj, src_relative_name, md5=None): | |
1568 | 1628 | cfg = Config() |
1569 | 1629 | attrs = {} |
1570 | 1630 | if cfg.preserve_attrs: |
1571 | 1631 | for attr in cfg.preserve_attrs_list: |
1632 | val = None | |
1572 | 1633 | if attr == 'uname': |
1573 | 1634 | try: |
1574 | val = Utils.urlencode_string(Utils.getpwuid_username(local_list[src]['uid']), unicode_output=True) | |
1635 | val = Utils.urlencode_string(Utils.getpwuid_username(src_obj['uid']), unicode_output=True) | |
1575 | 1636 | except (KeyError, TypeError): |
1576 | 1637 | attr = "uid" |
1577 | val = local_list[src].get('uid') | |
1638 | val = src_obj.get('uid') | |
1578 | 1639 | if val: |
1579 | warning(u"%s: Owner username not known. Storing UID=%d instead." % (src, val)) | |
1640 | warning(u"%s: Owner username not known. Storing UID=%d instead." % (src_relative_name, val)) | |
1580 | 1641 | elif attr == 'gname': |
1581 | 1642 | try: |
1582 | val = Utils.urlencode_string(Utils.getgrgid_grpname(local_list[src].get('gid')), unicode_output=True) | |
1643 | val = Utils.urlencode_string(Utils.getgrgid_grpname(src_obj.get('gid')), unicode_output=True) | |
1583 | 1644 | except (KeyError, TypeError): |
1584 | 1645 | attr = "gid" |
1585 | val = local_list[src].get('gid') | |
1646 | val = src_obj.get('gid') | |
1586 | 1647 | if val: |
1587 | warning(u"%s: Owner groupname not known. Storing GID=%d instead." % (src, val)) | |
1648 | warning(u"%s: Owner groupname not known. Storing GID=%d instead." % (src_relative_name, val)) | |
1588 | 1649 | elif attr != "md5": |
1589 | 1650 | try: |
1590 | val = getattr(local_list[src]['sr'], 'st_' + attr) | |
1591 | except: | |
1651 | val = getattr(src_obj['sr'], 'st_' + attr) | |
1652 | except Exception: | |
1592 | 1653 | val = None |
1593 | 1654 | if val is not None: |
1594 | 1655 | attrs[attr] = val |
1595 | 1656 | |
1596 | if 'md5' in cfg.preserve_attrs_list: | |
1597 | try: | |
1598 | val = local_list.get_md5(src) | |
1599 | if val is not None: | |
1600 | attrs['md5'] = val | |
1601 | except IOError: | |
1602 | pass | |
1657 | if 'md5' in cfg.preserve_attrs_list and md5: | |
1658 | attrs['md5'] = md5 | |
1603 | 1659 | |
1604 | 1660 | if attrs: |
1605 | result = "" | |
1661 | attr_str_list = [] | |
1606 | 1662 | for k in sorted(attrs.keys()): |
1607 | result += "%s:%s/" % (k, attrs[k]) | |
1608 | return {'x-amz-meta-s3cmd-attrs' : result[:-1]} | |
1663 | attr_str_list.append(u"%s:%s" % (k, attrs[k])) | |
1664 | attr_header = {'x-amz-meta-s3cmd-attrs': u'/'.join(attr_str_list)} | |
1609 | 1665 | else: |
1610 | return {} | |
1611 | ||
1666 | attr_header = {} | |
1667 | ||
1668 | return attr_header | |
1612 | 1669 | |
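The rewritten tail of _build_attr_header joins the sorted "key:value" pairs with '/' instead of hand-trimming a trailing slash; with a hypothetical attrs dict the header value comes out as:

    attrs = {'uid': 1000, 'mode': 33188, 'md5': 'd41d8cd98f00b204e9800998ecf8427e'}
    u'/'.join(u"%s:%s" % (k, attrs[k]) for k in sorted(attrs))
    # -> u'md5:d41d8cd98f00b204e9800998ecf8427e/mode:33188/uid:1000'
    # sent as {'x-amz-meta-s3cmd-attrs': <that string>}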
1613 | 1670 | def cmd_sync_local2remote(args): |
1614 | 1671 | cfg = Config() |
1670 | 1727 | seq += 1 |
1671 | 1728 | item = local_list[file] |
1672 | 1729 | src = item['full_name'] |
1730 | try: | |
1731 | src_md5 = local_list.get_md5(file) | |
1732 | except IOError: | |
1733 | src_md5 = None | |
1673 | 1734 | uri = S3Uri(item['remote_uri']) |
1674 | 1735 | seq_label = "[%d of %d]" % (seq, total) |
1675 | 1736 | extra_headers = copy(cfg.extra_headers) |
1676 | 1737 | try: |
1677 | attr_header = _build_attr_header(local_list, file) | |
1738 | attr_header = _build_attr_header(local_list[file], | |
1739 | file, src_md5) | |
1678 | 1740 | debug(u"attr_header: %s" % attr_header) |
1679 | 1741 | extra_headers.update(attr_header) |
1680 | 1742 | response = s3.object_put(src, uri, extra_headers, extra_label = seq_label) |
1688 | 1750 | continue |
1689 | 1751 | except InvalidFileError as exc: |
1690 | 1752 | error(u"Upload of '%s' is not possible (Reason: %s)" % (item['full_name'], exc)) |
1691 | ret = EX_PARTIAL | |
1692 | 1753 | if cfg.stop_on_error: |
1693 | 1754 | ret = EX_OSFILE |
1694 | 1755 | error(u"Exiting now because of --stop-on-error") |
1695 | 1756 | raise |
1757 | ret = EX_PARTIAL | |
1696 | 1758 | continue |
1697 | 1759 | speed_fmt = formatSize(response["speed"], human_readable = True, floating_point = True) |
1698 | 1760 | if not cfg.progress_meter: |
1761 | 1823 | output(u"upload: '%s' -> '%s'" % (local_list[key]['full_name'], local_list[key]['remote_uri'])) |
1762 | 1824 | for key in update_list: |
1763 | 1825 | output(u"upload: '%s' -> '%s'" % (update_list[key]['full_name'], update_list[key]['remote_uri'])) |
1764 | for (src_obj, dst1, dst2) in copy_pairs: | |
1826 | for (src_obj, dst1, dst2, md5) in copy_pairs: | |
1765 | 1827 | output(u"remote copy: '%s' -> '%s'" % (dst1, dst2)) |
1766 | 1828 | if cfg.delete_removed: |
1767 | 1829 | for key in remote_list: |
1792 | 1854 | # uploaded_objects_list reference is passed so it can be filled with |
1793 | 1855 | # destination objects of successful copies so that they can be
1794 | 1856 | # invalidated by cf |
1795 | n_copies, saved_bytes, failed_copy_files = remote_copy(s3, copy_pairs, | |
1796 | destination_base, | |
1797 | uploaded_objects_list) | |
1857 | n_copies, saved_bytes, failed_copy_files = remote_copy( | |
1858 | s3, copy_pairs, destination_base, uploaded_objects_list, True) | |
1798 | 1859 | |
1799 | 1860 | #upload files that could not be copied
1800 | debug("Process files that were not remote copied") | |
1861 | debug("Process files that were not remotely copied") | |
1801 | 1862 | failed_copy_count = len(failed_copy_files) |
1802 | 1863 | _set_remote_uri(failed_copy_files, destination_base, single_file_local) |
1803 | 1864 | status, n, size_transferred = _upload(failed_copy_files, n, upload_count + failed_copy_count, size_transferred) |
1885 | 1946 | def cmd_sync(args): |
1886 | 1947 | cfg = Config() |
1887 | 1948 | if (len(args) < 2): |
1888 | raise ParameterError("Too few parameters! Expected: %s" % commands['sync']['param']) | |
1949 | syntax_msg = '' | |
1950 | commands_list = get_commands_list() | |
1951 | for cmd in commands_list: | |
1952 | if cmd.get('cmd') == 'sync': | |
1953 | syntax_msg = cmd.get('param', '') | |
1954 | break | |
1955 | raise ParameterError("Too few parameters! Expected: %s" % syntax_msg) | |
1889 | 1956 | if cfg.delay_updates: |
1890 | 1957 | warning(u"`delay-updates` is obsolete.") |
1891 | 1958 | |
2075 | 2142 | #id = '' |
2076 | 2143 | #if(len(args) > 1): id = args[1] |
2077 | 2144 | |
2078 | response = s3.get_multipart(uri) | |
2079 | debug(u"response - %s" % response['status']) | |
2145 | upload_list = s3.get_multipart(uri) | |
2080 | 2146 | output(u"%s" % uri) |
2081 | tree = getTreeFromXml(response['data']) | |
2082 | debug(parseNodes(tree)) | |
2147 | debug(upload_list) | |
2083 | 2148 | output(u"Initiated\tPath\tId") |
2084 | for mpupload in parseNodes(tree): | |
2149 | for mpupload in upload_list: | |
2085 | 2150 | try: |
2086 | output(u"%s\t%s\t%s" % (mpupload['Initiated'], "s3://" + uri.bucket() + "/" + mpupload['Key'], mpupload['UploadId'])) | |
2151 | output(u"%s\t%s\t%s" % ( | |
2152 | mpupload['Initiated'], | |
2153 | "s3://" + uri.bucket() + "/" + mpupload['Key'], | |
2154 | mpupload['UploadId'])) | |
2087 | 2155 | except KeyError: |
2088 | 2156 | pass |
2089 | 2157 | return EX_OK |
2106 | 2174 | uri = S3Uri(args[0]) |
2107 | 2175 | id = args[1] |
2108 | 2176 | |
2109 | response = s3.list_multipart(uri, id) | |
2110 | debug(u"response - %s" % response['status']) | |
2111 | tree = getTreeFromXml(response['data']) | |
2177 | part_list = s3.list_multipart(uri, id) | |
2112 | 2178 | output(u"LastModified\t\t\tPartNumber\tETag\tSize") |
2113 | for mpupload in parseNodes(tree): | |
2179 | for mpupload in part_list: | |
2114 | 2180 | try: |
2115 | output(u"%s\t%s\t%s\t%s" % (mpupload['LastModified'], mpupload['PartNumber'], mpupload['ETag'], mpupload['Size'])) | |
2116 | except: | |
2181 | output(u"%s\t%s\t%s\t%s" % (mpupload['LastModified'], | |
2182 | mpupload['PartNumber'], | |
2183 | mpupload['ETag'], | |
2184 | mpupload['Size'])) | |
2185 | except KeyError: | |
2117 | 2186 | pass |
2118 | 2187 | return EX_OK |
2119 | 2188 | |
2187 | 2256 | pass |
2188 | 2257 | return text # leave as is |
2189 | 2258 | text = text.encode('ascii', 'xmlcharrefreplace') |
2190 | return re.sub("&#?\w+;", _unescape_fixup, text) | |
2259 | return re.sub(r"&#?\w+;", _unescape_fixup, text) | |
2191 | 2260 | |
2192 | 2261 | cfg = Config() |
2193 | 2262 | cfg.urlencoding_mode = "fixbucket" |
2199 | 2268 | if culprit.type != "s3": |
2200 | 2269 | raise ParameterError("Expecting S3Uri instead of: %s" % arg) |
2201 | 2270 | response = s3.bucket_list_noparse(culprit.bucket(), culprit.object(), recursive = True) |
2202 | r_xent = re.compile("&#x[\da-fA-F]+;") | |
2271 | r_xent = re.compile(r"&#x[\da-fA-F]+;") | |
2203 | 2272 | keys = re.findall("<Key>(.*?)</Key>", response['data'], re.MULTILINE | re.UNICODE) |
2204 | 2273 | debug("Keys: %r" % keys) |
2205 | 2274 | for key in keys: |
2304 | 2373 | |
2305 | 2374 | if getattr(cfg, "proxy_host") == "" and os.getenv("http_proxy"): |
2306 | 2375 | autodetected_encoding = locale.getpreferredencoding() or "UTF-8" |
2307 | re_match=re.match("(http://)?([^:]+):(\d+)", | |
2376 | re_match=re.match(r"(http://)?([^:]+):(\d+)", | |
2308 | 2377 | unicodise_s(os.getenv("http_proxy"), autodetected_encoding)) |
2309 | 2378 | if re_match: |
2310 | 2379 | setattr(cfg, "proxy_host", re_match.groups()[1]) |
2448 | 2517 | with open(deunicodise(fname), "rt") as fn: |
2449 | 2518 | for pattern in fn: |
2450 | 2519 | pattern = unicodise(pattern).strip() |
2451 | if re.match("^#", pattern) or re.match("^\s*$", pattern): | |
2520 | if re.match("^#", pattern) or re.match(r"^\s*$", pattern): | |
2452 | 2521 | continue |
2453 | 2522 | debug(u"%s: adding rule: %s" % (fname, pattern)) |
2454 | 2523 | patterns_list.append(pattern) |
2459 | 2528 | return patterns_list |
2460 | 2529 | |
2461 | 2530 | def process_patterns(patterns_list, patterns_from, is_glob, option_txt = ""): |
2462 | """ | |
2531 | r""" | |
2463 | 2532 | process_patterns(patterns, patterns_from, is_glob, option_txt = "") |
2464 | 2533 | Process --exclude / --include GLOB and REGEXP patterns. |
2465 | 2534 | 'option_txt' is 'exclude' / 'include' / 'rexclude' / 'rinclude' |
2503 | 2572 | {"cmd":"rm", "label":"Delete file from bucket (alias for del)", "param":"s3://BUCKET/OBJECT", "func":cmd_object_del, "argc":1}, |
2504 | 2573 | #{"cmd":"mkdir", "label":"Make a virtual S3 directory", "param":"s3://BUCKET/path/to/dir", "func":cmd_mkdir, "argc":1}, |
2505 | 2574 | {"cmd":"restore", "label":"Restore file from Glacier storage", "param":"s3://BUCKET/OBJECT", "func":cmd_object_restore, "argc":1}, |
2506 | {"cmd":"sync", "label":"Synchronize a directory tree to S3 (checks files freshness using size and md5 checksum, unless overridden by options, see below)", "param":"LOCAL_DIR s3://BUCKET[/PREFIX] or s3://BUCKET[/PREFIX] LOCAL_DIR", "func":cmd_sync, "argc":2}, | |
2575 | {"cmd":"sync", "label":"Synchronize a directory tree to S3 (checks files freshness using size and md5 checksum, unless overridden by options, see below)", "param":"LOCAL_DIR s3://BUCKET[/PREFIX] or s3://BUCKET[/PREFIX] LOCAL_DIR or s3://BUCKET[/PREFIX] s3://BUCKET[/PREFIX]", "func":cmd_sync, "argc":2}, | |
2507 | 2576 | {"cmd":"du", "label":"Disk usage by buckets", "param":"[s3://BUCKET[/PREFIX]]", "func":cmd_du, "argc":0}, |
2508 | 2577 | {"cmd":"info", "label":"Get various information about Buckets or Files", "param":"s3://BUCKET[/OBJECT]", "func":cmd_info, "argc":1}, |
2509 | 2578 | {"cmd":"cp", "label":"Copy object", "param":"s3://BUCKET1/OBJECT1 s3://BUCKET2[/OBJECT2]", "func":cmd_cp, "argc":2}, |
2567 | 2636 | acl.grantAnonRead() |
2568 | 2637 | something_changed = True |
2569 | 2638 | elif cfg.acl_public == False: # we explicitly check for False, because it could be None
2570 | if not acl.isAnonRead(): | |
2639 | if not acl.isAnonRead() and not acl.isAnonWrite(): | |
2571 | 2640 | info(u"%s: already Private, skipping %s" % (uri, seq_label)) |
2572 | 2641 | else: |
2573 | 2642 | acl.revokeAnonRead() |
2643 | acl.revokeAnonWrite() | |
2574 | 2644 | something_changed = True |
2575 | 2645 | |
2576 | 2646 | # update acl with arguments |
2597 | 2667 | output(u"%s: ACL updated" % uri) |
2598 | 2668 | |
2599 | 2669 | class OptionMimeType(Option): |
2600 | def check_mimetype(option, opt, value): | |
2601 | if re.compile("^[a-z0-9]+/[a-z0-9+\.-]+(;.*)?$", re.IGNORECASE).match(value): | |
2670 | def check_mimetype(self, opt, value): | |
2671 | if re.compile(r"^[a-z0-9]+/[a-z0-9+\.-]+(;.*)?$", re.IGNORECASE).match(value): | |
2602 | 2672 | return value |
2603 | 2673 | raise OptionValueError("option %s: invalid MIME-Type format: %r" % (opt, value)) |
2604 | 2674 | |
2605 | 2675 | class OptionS3ACL(Option): |
2606 | def check_s3acl(option, opt, value): | |
2676 | def check_s3acl(self, opt, value): | |
2607 | 2677 | permissions = ('read', 'write', 'read_acp', 'write_acp', 'full_control', 'all') |
2608 | 2678 | try: |
2609 | permission, grantee = re.compile("^(\w+):(.+)$", re.IGNORECASE).match(value).groups() | |
2679 | permission, grantee = re.compile(r"^(\w+):(.+)$", re.IGNORECASE).match(value).groups() | |
2610 | 2680 | if not permission or not grantee: |
2611 | raise | |
2681 | raise OptionValueError("option %s: invalid S3 ACL format: %r" % (opt, value)) | |
2612 | 2682 | if permission in permissions: |
2613 | 2683 | return { 'name' : grantee, 'permission' : permission.upper() } |
2614 | 2684 | else: |
2615 | 2685 | raise OptionValueError("option %s: invalid S3 ACL permission: %s (valid values: %s)" % |
2616 | 2686 | (opt, permission, ", ".join(permissions))) |
2617 | except: | |
2687 | except OptionValueError: | |
2688 | raise | |
2689 | except Exception: | |
2618 | 2690 | raise OptionValueError("option %s: invalid S3 ACL format: %r" % (opt, value)) |
2619 | 2691 | |
2620 | 2692 | class OptionAll(OptionMimeType, OptionS3ACL): |
2647 | 2719 | |
2648 | 2720 | config_file = None |
2649 | 2721 | if os.getenv("S3CMD_CONFIG"): |
2650 | config_file = unicodise_s(os.getenv("S3CMD_CONFIG"), autodetected_encoding) | |
2722 | config_file = unicodise_s(os.getenv("S3CMD_CONFIG"), | |
2723 | autodetected_encoding) | |
2651 | 2724 | elif os.name == "nt" and os.getenv("USERPROFILE"): |
2652 | config_file = os.path.join(unicodise_s(os.getenv("USERPROFILE"), autodetected_encoding), | |
2653 | os.getenv("APPDATA") and unicodise_s(os.getenv("APPDATA"), autodetected_encoding) | |
2654 | or 'Application Data', | |
2655 | "s3cmd.ini") | |
2725 | config_file = os.path.join( | |
2726 | unicodise_s(os.getenv("USERPROFILE"), autodetected_encoding), | |
2727 | os.getenv("APPDATA") | |
2728 | and unicodise_s(os.getenv("APPDATA"), autodetected_encoding) | |
2729 | or 'Application Data', | |
2730 | "s3cmd.ini") | |
2656 | 2731 | else: |
2657 | 2732 | from os.path import expanduser |
2658 | 2733 | config_file = os.path.join(expanduser("~"), ".s3cfg") |
2689 | 2764 | optparser.add_option( "--restore-priority", dest="restore_priority", action="store", choices=['standard', 'expedited', 'bulk'], help="Priority for restoring files from S3 Glacier (only for 'restore' command). Choices available: bulk, standard, expedited") |
2690 | 2765 | |
2691 | 2766 | optparser.add_option( "--delete-removed", dest="delete_removed", action="store_true", help="Delete destination objects with no corresponding source file [sync]") |
2692 | optparser.add_option( "--no-delete-removed", dest="delete_removed", action="store_false", help="Don't delete destination objects.") | |
2767 | optparser.add_option( "--no-delete-removed", dest="delete_removed", action="store_false", help="Don't delete destination objects [sync]") | |
2693 | 2768 | optparser.add_option( "--delete-after", dest="delete_after", action="store_true", help="Perform deletes AFTER new uploads when delete-removed is enabled [sync]") |
2694 | 2769 | optparser.add_option( "--delay-updates", dest="delay_updates", action="store_true", help="*OBSOLETE* Put all updated files into place at end [sync]") # OBSOLETE |
2695 | 2770 | optparser.add_option( "--max-delete", dest="max_delete", action="store", help="Do not delete more than NUM files. [del] and [sync]", metavar="NUM") |
2767 | 2842 | optparser.add_option( "--cache-file", dest="cache_file", action="store", default="", metavar="FILE", help="Cache FILE containing local source MD5 values") |
2768 | 2843 | optparser.add_option("-q", "--quiet", dest="quiet", action="store_true", default=False, help="Silence output on stdout") |
2769 | 2844 | optparser.add_option( "--ca-certs", dest="ca_certs_file", action="store", default=None, help="Path to SSL CA certificate FILE (instead of system default)") |
2845 | optparser.add_option( "--ssl-cert", dest="ssl_client_cert_file", action="store", default=None, help="Path to client own SSL certificate CRT_FILE") | |
2846 | optparser.add_option( "--ssl-key", dest="ssl_client_key_file", action="store", default=None, help="Path to client own SSL certificate private key KEY_FILE") | |
2770 | 2847 | optparser.add_option( "--check-certificate", dest="check_ssl_certificate", action="store_true", help="Check SSL certificate validity") |
2771 | 2848 | optparser.add_option( "--no-check-certificate", dest="check_ssl_certificate", action="store_false", help="Do not check SSL certificate validity") |
2772 | 2849 | optparser.add_option( "--check-hostname", dest="check_ssl_hostname", action="store_true", help="Check SSL certificate hostname validity") |
2872 | 2949 | |
2873 | 2950 | ## Process --(no-)check-md5 |
2874 | 2951 | if options.check_md5 == False: |
2875 | try: | |
2952 | if "md5" in cfg.sync_checks: | |
2876 | 2953 | cfg.sync_checks.remove("md5") |
2954 | if "md5" in cfg.preserve_attrs_list: | |
2877 | 2955 | cfg.preserve_attrs_list.remove("md5") |
2878 | except Exception: | |
2879 | pass | |
2880 | if options.check_md5 == True: | |
2881 | if cfg.sync_checks.count("md5") == 0: | |
2956 | elif options.check_md5 == True: | |
2957 | if "md5" not in cfg.sync_checks: | |
2882 | 2958 | cfg.sync_checks.append("md5") |
2883 | if cfg.preserve_attrs_list.count("md5") == 0: | |
2959 | if "md5" not in cfg.preserve_attrs_list: | |
2884 | 2960 | cfg.preserve_attrs_list.append("md5") |
2885 | 2961 | |
2886 | 2962 | ## Update Config with other parameters |
3040 | 3116 | branch found at: |
3041 | 3117 | https://github.com/s3tools/s3cmd |
3042 | 3118 | and have a look at the known issues list: |
3043 | https://github.com/s3tools/s3cmd/wiki/Common-known-issues-and-their-solutions | |
3119 | https://github.com/s3tools/s3cmd/wiki/Common-known-issues-and-their-solutions-(FAQ) | |
3044 | 3120 | If the error persists, please report the |
3045 | 3121 | %s (removing any private |
3046 | 3122 | info as necessary) to: |
3053 | 3129 | try: |
3054 | 3130 | s = u' '.join([unicodise(a) for a in sys.argv]) |
3055 | 3131 | except NameError: |
3056 | # Error happened before Utils module was yet imported to provide unicodise | |
3132 | # Error happened before the Utils module was imported to provide
3133 | # unicodise | |
3057 | 3134 | try: |
3058 | 3135 | s = u' '.join([(a) for a in sys.argv]) |
3059 | 3136 | except UnicodeDecodeError: |
3065 | 3142 | try: |
3066 | 3143 | sys.stderr.write(u"Problem: %s: %s\n" % (e_class, e)) |
3067 | 3144 | except UnicodeDecodeError: |
3068 | sys.stderr.write(u"Problem: [encoding safe] %r: %r\n" % (e_class, e)) | |
3145 | sys.stderr.write(u"Problem: [encoding safe] %r: %r\n" | |
3146 | % (e_class, e)) | |
3069 | 3147 | try: |
3070 | 3148 | sys.stderr.write(u"S3cmd: %s\n" % PkgInfo.version) |
3071 | 3149 | except NameError: |
3072 | sys.stderr.write(u"S3cmd: unknown version. Module import problem?\n") | |
3150 | sys.stderr.write(u"S3cmd: unknown version." | |
3151 | "Module import problem?\n") | |
3073 | 3152 | sys.stderr.write(u"python: %s\n" % sys.version) |
3074 | 3153 | try: |
3075 | sys.stderr.write(u"environment LANG=%s\n" % unicodise_s(os.getenv("LANG"), 'ascii')) | |
3154 | sys.stderr.write(u"environment LANG=%s\n" | |
3155 | % unicodise_s(os.getenv("LANG", "NOTSET"), | |
3156 | 'ascii')) | |
3076 | 3157 | except NameError: |
3077 | sys.stderr.write(u"environment LANG=%s\n" % os.getenv("LANG")) | |
3158 | # Error happened before the Utils module was imported to provide
3159 | # unicodise | |
3160 | sys.stderr.write(u"environment LANG=%s\n" | |
3161 | % os.getenv("LANG", "NOTSET")) | |
3078 | 3162 | sys.stderr.write(u"\n") |
3079 | 3163 | if type(tb) == unicode: |
3080 | 3164 | sys.stderr.write(tb) |
3086 | 3170 | sys.stderr.write("Your sys.path contains these entries:\n") |
3087 | 3171 | for path in sys.path: |
3088 | 3172 | sys.stderr.write(u"\t%s\n" % path) |
3089 | sys.stderr.write("Now the question is where have the s3cmd modules been installed?\n") | |
3173 | sys.stderr.write("Now the question is where have the s3cmd modules" | |
3174 | " been installed?\n") | |
3090 | 3175 | |
3091 | 3176 | sys.stderr.write(alert_header % (u"above lines", u"")) |
3092 | 3177 | |
3105 | 3190 | from S3.S3Uri import S3Uri |
3106 | 3191 | from S3 import Utils |
3107 | 3192 | from S3 import Crypto |
3108 | from S3.Utils import * | |
3193 | from S3.BaseUtils import (formatDateTime, getPrettyFromXml, | |
3194 | encode_to_s3, decode_from_s3) | |
3195 | from S3.Utils import (formatSize, unicodise_safe, unicodise_s, | |
3196 | unicodise, deunicodise, replace_nonprintables) | |
3109 | 3197 | from S3.Progress import Progress, StatsInfo |
3110 | 3198 | from S3.CloudFront import Cmd as CfCmd |
3111 | 3199 | from S3.CloudFront import CloudFront |
3186 | 3274 | sys.exit(EX_OSERR) |
3187 | 3275 | |
3188 | 3276 | except UnicodeEncodeError as e: |
3189 | lang = unicodise_s(os.getenv("LANG"), 'ascii') | |
3277 | lang = unicodise_s(os.getenv("LANG", "NOTSET"), 'ascii') | |
3190 | 3278 | msg = """ |
3191 | 3279 | You have encountered a UnicodeEncodeError. Your environment |
3192 | 3280 | variable LANG=%s may not specify a Unicode encoding (e.g. UTF-8). |
48 | 48 | s3cmd \fBrestore\fR \fIs3://BUCKET/OBJECT\fR |
49 | 49 | Restore file from Glacier storage |
50 | 50 | .TP |
51 | s3cmd \fBsync\fR \fILOCAL_DIR s3://BUCKET[/PREFIX] or s3://BUCKET[/PREFIX] LOCAL_DIR\fR | |
51 | s3cmd \fBsync\fR \fILOCAL_DIR s3://BUCKET[/PREFIX] or s3://BUCKET[/PREFIX] LOCAL_DIR or s3://BUCKET[/PREFIX] s3://BUCKET[/PREFIX]\fR | |
52 | 52 | Synchronize a directory tree to S3 (checks files freshness using size and md5 checksum, unless overridden by options, see below) |
53 | 53 | .TP |
54 | 54 | s3cmd \fBdu\fR \fI[s3://BUCKET[/PREFIX]]\fR |
275 | 275 | source file [sync] |
276 | 276 | .TP |
277 | 277 | \fB\-\-no\-delete\-removed\fR |
278 | Don't delete destination objects. | |
278 | Don't delete destination objects [sync] | |
279 | 279 | .TP |
280 | 280 | \fB\-\-delete\-after\fR |
281 | 281 | Perform deletes AFTER new uploads when delete-removed |
523 | 523 | Enable debug output. |
524 | 524 | .TP |
525 | 525 | \fB\-\-version\fR |
526 | Show s3cmd version (2.1.0) and exit. | |
526 | Show s3cmd version (2.2.0) and exit. | |
527 | 527 | .TP |
528 | 528 | \fB\-F\fR, \fB\-\-follow\-symlinks\fR |
529 | 529 | Follow symbolic links as if they are regular files |
537 | 537 | \fB\-\-ca\-certs\fR=CA_CERTS_FILE |
538 | 538 | Path to SSL CA certificate FILE (instead of system |
539 | 539 | default) |
540 | .TP | |
541 | \fB\-\-ssl\-cert\fR=SSL_CLIENT_CERT_FILE | |
542 | Path to the client's own SSL certificate CRT_FILE
543 | .TP | |
544 | \fB\-\-ssl\-key\fR=SSL_CLIENT_KEY_FILE | |
545 | Path to the client's own SSL certificate private key
546 | KEY_FILE | |
540 | 547 | .TP |
541 | 548 | \fB\-\-check\-certificate\fR |
542 | 549 | Check SSL certificate validity |
0 | 0 | Metadata-Version: 1.2 |
1 | 1 | Name: s3cmd |
2 | Version: 2.1.0 | |
2 | Version: 2.2.0 | |
3 | 3 | Summary: Command line tool for managing Amazon S3 and CloudFront services |
4 | 4 | Home-page: http://s3tools.org |
5 | 5 | Author: Michal Ludvig |
19 | 19 | Authors: |
20 | 20 | -------- |
21 | 21 | Florent Viard <florent@sodria.com> |
22 | ||
22 | 23 | Michal Ludvig <michal@logix.cz> |
24 | ||
23 | 25 | Matt Domsch (github.com/mdomsch) |
24 | 26 | |
25 | 27 | Platform: UNKNOWN |