+45 70 27 40 08

sales@meeho.net

» Ruby – how to detect the encoding of a string

Posted by Kasper Tidemann on Monday 22nd of March 2010 10:51:59 PM

With file uploads in Ruby on Rails, e.g. an upload of a 2 KB CSV file, you’ll often run into trouble trying to decipher the encoding of the Tempfile string data stored in params[:my_upload_form][:uploaded_file] or whatever you’ve named your input field.

If you want to keep everything in one encoding, you can use Iconv.conv('UTF-8', <whatever encoding>, string) to convert the data from the input field to UTF-8. But for the iconv() wrapper to work properly, it needs to know which encoding to convert from. So how do you acquire this knowledge?
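For reference, here is a minimal sketch of the conversion step on its own, for the case where the source encoding is already known. It assumes Ruby 1.9 or later, where String#encode does the same job as Iconv.conv (Iconv was dropped from the standard library in Ruby 2.0). The sample bytes and the choice of ISO-8859-1 are made up for illustration.

```ruby
# Illustrative sample: the bytes an ISO-8859-1 encoded upload might contain.
# dup gives us a mutable copy before re-tagging the encoding.
raw = "bl\xE5b\xE6rgr\xF8d".dup

# Tag the bytes with their actual encoding, then transcode to UTF-8.
# Roughly what Iconv.conv('UTF-8', 'ISO-8859-1', raw) does.
converted = raw.force_encoding('ISO-8859-1').encode('UTF-8')

puts converted.encoding
puts converted.valid_encoding?
```

If the encoding named in force_encoding is wrong, the result will either be garbled or raise an error, which is exactly why the source encoding has to be known first.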

One option is the Ruby gem rchardet by Jeff Hodges. Here is an example of how to use it:

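require 'iconv'
require 'rchardet'

[...]

# Read the upload once; this assumes the param is the uploaded file object, not a String
data = params[:my_upload_form][:uploaded_file].read

# CharDet.detect returns a hash with the guessed encoding under 'encoding'
cd = CharDet.detect(data)
encoding = cd['encoding']

# Convert from the detected encoding to UTF-8
converted_string = Iconv.conv('UTF-8', encoding, data)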

The above is not bulletproof, since the detection is only a best guess and can be wrong for short or ambiguous input, but it’ll get you going. If you have alternative ideas in this regard, please comment to let us all know.
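One way to soften the "not bulletproof" part is to treat the detected encoding as a first guess and fall back when it does not fit. The sketch below is an assumption-laden illustration, not part of rchardet: to_utf8 is a hypothetical helper, the ISO-8859-1 fallback is an arbitrary choice, and it uses String#encode (Ruby 1.9 or later) instead of Iconv.

```ruby
# Hypothetical helper: convert raw upload bytes to UTF-8 given a detector's guess.
# `detected` stands in for cd['encoding'] and is assumed to possibly be nil.
def to_utf8(data, detected, fallback = 'ISO-8859-1')
  bytes = data.dup.force_encoding(Encoding::BINARY)

  # Try the detector's guess first, then plain UTF-8, then the fallback.
  [detected, 'UTF-8', fallback].compact.each do |name|
    begin
      candidate = bytes.dup.force_encoding(name)
      return candidate.encode('UTF-8') if candidate.valid_encoding?
    rescue ArgumentError, EncodingError
      next # unknown encoding name or failed conversion: try the next candidate
    end
  end

  # Last resort: keep the ASCII part and replace everything that cannot be decoded.
  bytes.encode('UTF-8', invalid: :replace, undef: :replace)
end
```

In the upload example it could be called as to_utf8(uploaded_bytes, cd['encoding']), where uploaded_bytes is a hypothetical variable holding the string read from the upload. Note that a single-byte fallback such as ISO-8859-1 accepts any byte sequence, so the result is always a valid string but not necessarily the right text.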

 

1

Bráulio Bhavamitra says:

worked, thanks!

 

