How to serve a static file with a Hebrew name in Python Bottle?

I receive a request from the client to download some file from the server. The filename is in Hebrew.

import os
import bottle

@bottle.get("/topic/download/<folder_name>/<file_name>")
def download(folder_name, file_name):
    # Python 2: decode the byte strings from the URL into unicode
    file_name = file_name.decode('utf-8')
    folder_name = folder_name.decode('utf-8')

    if os.path.exists(os.path.join(folder_name, file_name)):
        return bottle.static_file(file_name, root=folder_name, download=True)

The last line fails:

return bottle.static_file(file_name, root=folder_name, download=True)

I get an exception:

UnicodeEncodeError: 'ascii' codec can't encode characters in position 22-25: ordinal not in range(128)

I have no idea what I am doing wrong here.

The call stack shows that the exception originates in the Bottle code:

File "C:\Python27\Lib\site-packages\bottle-0.10.9-py2.7.egg\bottle.py", line 1669, in __setitem__
  def __setitem__(self, key, value): self.dict[_hkey(key)] = [str(value)]

Please help.

Regards, Omer.

Bottle is trying to set the Content-Disposition header on the HTTP response to attachment; filename=.... This fails for non-ASCII characters because Bottle handles HTTP headers as str internally (the str(value) call in the traceback is what raises the error). Even if it didn't, there's no cross-browser-compatible way to set a Content-Disposition with a non-ASCII filename.

You could set download='...' to a safe ASCII-only string to override Bottle's default guess (which is using the local filename, containing Unicode).
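One way to produce such an ASCII-only name is sketched below. The helper name ascii_fallback_name and the default stem "download" are illustrative assumptions, not part of Bottle, and the code is written for Python 3 rather than the Python 2 used in the question. It strips accents where it can and falls back to a generic stem when nothing ASCII survives, which is what happens with a Hebrew name.

```python
import os
import unicodedata


def ascii_fallback_name(filename, default_stem="download"):
    """Return an ASCII-only stand-in for filename (hypothetical helper)."""

    def to_ascii(text):
        # NFKD splits accented letters into base letter + combining mark;
        # encoding with "ignore" then drops everything that is not ASCII.
        decomposed = unicodedata.normalize("NFKD", text)
        return decomposed.encode("ascii", "ignore").decode("ascii")

    stem, ext = os.path.splitext(filename)
    # If no ASCII characters survive in the stem, use the generic default.
    safe_stem = to_ascii(stem).strip() or default_stem
    return safe_stem + to_ascii(ext)


# "\u05e9\u05dc\u05d5\u05dd" is a four-letter Hebrew word; none of it is ASCII.
print(ascii_fallback_name("\u05e9\u05dc\u05d5\u05dd.pdf"))  # -> download.pdf
print(ascii_fallback_name("r\xe9sum\xe9.txt"))              # -> resume.txt
```

The result could then be passed as the download argument, for example bottle.static_file(file_name, root=folder_name, download=ascii_fallback_name(file_name)). The sketch does not sanitize characters such as quotes or semicolons, so treat it as a starting point only.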

Alternatively, omit the download argument and rely on the browser guessing the filename from the end of the URL. (This is the only widely compatible way to get a Unicode download filename.) Unfortunately then Bottle will omit Content-Disposition completely, so consider altering the headers on the returned response to include plain Content-Disposition: attachment without a filename. Or perhaps you don't care, if the Content-Type is one that will always get downloaded anyway.
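To illustrate what "the end of the URL" looks like for a Hebrew name, here is a small Python 3 sketch using only the standard library. The function name download_url and the sample folder name are assumptions; the route prefix is taken from the question's handler. The last path segment is the UTF-8 filename, percent-encoded, and decoding it gives back the original name.

```python
from urllib.parse import quote, unquote


def download_url(folder_name, file_name, prefix="/topic/download"):
    """Build a download URL whose last segment is the encoded filename."""
    # quote() percent-encodes the UTF-8 bytes; safe="" also escapes "/".
    return "/".join([prefix, quote(folder_name, safe=""), quote(file_name, safe="")])


hebrew_name = "\u05e9\u05dc\u05d5\u05dd.pdf"  # a Hebrew filename
url = download_url("files", hebrew_name)
print(url)  # -> /topic/download/files/%D7%A9%D7%9C%D7%95%D7%9D.pdf

# Decoding the last path segment recovers the original name.
last_segment = url.rsplit("/", 1)[-1]
assert unquote(last_segment) == hebrew_name
```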

The Unicode characters u'\xc3' and u'\xc9' do not have any corresponding ASCII values. So, if you don't want to lose data, you have to encode that data in some way that's valid as ASCII. Options include:

>>> print s.encode('ascii', errors='backslashreplace')
abra\xc3o jos\xc9
>>> print s.encode('ascii', errors='xmlcharrefreplace')
abra&#195;o jos&#201;
>>> print s.encode('unicode-escape')
abra\xc3o jos\xc9
>>> print s.encode('punycode')
abrao jos-jta5e

All of these are ASCII strings, and contain all of the information from your original Unicode string (so they can all be reversed without loss of data), but none of them are all that pretty for an end user (and none of them can be reversed just by decode('ascii')).
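As a rough check of the reversibility claim, the following Python 3 sketch encodes a sample string with each option and decodes it again. The value of s is an assumption reconstructed from the outputs shown above, and in Python 3 the encoded results are bytes rather than str.

```python
import html

s = "abra\xc3o jos\xc9"  # assumed sample string, as a Python 3 str

backslash = s.encode("ascii", errors="backslashreplace")
xml_refs = s.encode("ascii", errors="xmlcharrefreplace")
escaped = s.encode("unicode-escape")
puny = s.encode("punycode")

# Every result is pure ASCII, so decode("ascii") succeeds on all of them...
for data in (backslash, xml_refs, escaped, puny):
    data.decode("ascii")

# ...but it returns the escaped text, not the original string.
assert backslash.decode("ascii") != s

# Getting s back needs the matching decoder for each scheme.
# (The backslashreplace round trip assumes s contains no literal backslashes.)
assert backslash.decode("unicode-escape") == s
assert html.unescape(xml_refs.decode("ascii")) == s
assert escaped.decode("unicode-escape") == s
assert puny.decode("punycode") == s
```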

See str.encode, Python Specific Encodings, and the Unicode HOWTO for more info.


As a side note, when some people say "ASCII", they really don't mean "ASCII" but rather "any 8-bit character set that's a superset of ASCII" or "some particular 8-bit character set that I have in mind". If that's what you meant, the solution is to encode to the right 8-bit character set:

>>> s.encode('utf-8')
'abra\xc3\x83o jos\xc3\x89'
>>> s.encode('cp1252')
'abra\xc3o jos\xc9'
>>> s.encode('iso-8859-15')
'abra\xc3o jos\xc9'

The hard part is knowing which character set you meant. If you're writing both the code that produces the 8-bit strings and the code that consumes them, and you don't know any better, you meant UTF-8. If the code that consumes the 8-bit strings is, say, the open function or a web browser that you're serving a page to or something else, things are more complicated, and there's no easy answer without a lot more information.
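A short Python 3 sketch of why the producer and the consumer have to agree on the character set. The sample string is the same assumed value as in the interpreter sessions above; decoding the UTF-8 bytes with a different 8-bit charset succeeds silently but yields different text.

```python
s = "abra\xc3o jos\xc9"  # assumed sample string

# Producer side: encode to UTF-8 bytes.
data = s.encode("utf-8")
assert data == b"abra\xc3\x83o jos\xc3\x89"

# Consumer side, same charset: the round trip is lossless.
assert data.decode("utf-8") == s

# Consumer side, wrong charset: no error is raised, but each two-byte
# UTF-8 sequence is read as two separate cp1252 characters.
misread = data.decode("cp1252")
assert misread != s
assert len(misread) == len(s) + 2
```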

Delete static_path from the app settings.

Then set your handlers like:

handlers = [
            (r'/(favicon.ico)', tornado.web.StaticFileHandler, {'path': favicon_path_dir}),
            (r'/static/(.*)', tornado.web.StaticFileHandler, {'path': static_path_dir}),
            (r'/', webhandler)
]

As indicated in the documentation, you should serve static files using the static function, and CSS is a static file. The static function handles security and some other functionality, which you can find in the source. The path argument to the static function should point to the directory where you store the CSS files.

As Liam Kelly commented, the snippets from this post should work. Using cgi.FieldStorage makes it possible to easily send file metadata without explicitly sending it. A Klein/Twisted approach would look something like this:

# Note: the cgi module was removed from the standard library in Python 3.13
from cgi import FieldStorage
from klein import Klein
from werkzeug.utils import secure_filename

app = Klein()

@app.route('/')
def formpage(request):
    return '''
    <form action="/images" enctype="multipart/form-data" method="post">
    <p>
        please specify a file, or a set of files:<br>
        <input type="file" name="datafile" size="40">
    </p>
    <div>
        <input type="submit" value="send">
    </div>
    </form>
    '''

@app.route('/images', methods=['POST'])
def processimages(request):
    method = request.method.decode('utf-8').upper()
    content_type = request.getHeader('content-type')

    # Parse the multipart body so the uploaded file's metadata is available
    img = FieldStorage(
        fp=request.content,
        headers=request.getAllHeaders(),
        environ={'REQUEST_METHOD': method, 'CONTENT_TYPE': content_type})
    name = secure_filename(img[b'datafile'].filename)

    with open(name, 'wb') as fileoutput:
        # fileoutput.write(img['datafile'].value)
        fileoutput.write(request.args[b'datafile'][0])

app.run('localhost', 8000)

For whatever reason, my Python 3.4 (Ubuntu 14.04) version of cgi.FieldStorage doesn't return the correct results. I tested this on Python 2.7.11 and it works fine. That said, you could also collect the filename and other metadata on the frontend and send them in an AJAX call to Klein. This way you won't have to do too much processing on the backend (which is usually a good thing). Alternatively, you could figure out how to use the utilities provided by werkzeug. The functions werkzeug.secure_filename and request.files (i.e. FileStorage) aren't particularly difficult to implement or recreate.
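To give an idea of what recreating such a utility might involve, here is a minimal Python 3 sketch. The name simple_secure_filename is hypothetical, and this is not werkzeug's actual implementation, only an approximation of the same idea: reduce the name to ASCII, neutralize path separators, and drop anything outside a small whitelist.

```python
import re
import unicodedata

# Anything that is not a letter, digit, underscore, dot, or hyphen is removed.
_UNSAFE_CHARS = re.compile(r"[^A-Za-z0-9_.-]")


def simple_secure_filename(filename):
    """Reduce filename to a flat, ASCII-only name (hypothetical helper)."""
    # Drop non-ASCII characters, keeping base letters of accented ones.
    text = unicodedata.normalize("NFKD", filename)
    text = text.encode("ascii", "ignore").decode("ascii")
    # Turn path separators into spaces so no directory part survives.
    for separator in ("/", "\\"):
        text = text.replace(separator, " ")
    # Collapse whitespace into underscores, then strip unsafe characters.
    text = _UNSAFE_CHARS.sub("", "_".join(text.split()))
    # Leading or trailing dots and underscores are removed as well.
    return text.strip("._")


print(simple_secure_filename("../../etc/passwd"))   # -> etc_passwd
print(simple_secure_filename("My cool movie.mov"))  # -> My_cool_movie.mov
```

With this approach a purely Hebrew name collapses to whatever ASCII remains (only the extension, for instance), so a caller would probably want a fallback name for that case.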


Tags: Python Unicode Encoding Bottle Static Files