Despite not having access to a suitable environment at home, I decided to enter a new Kaggle competition. The StumbleUpon Evergreen Classification Challenge looks easy to tackle, since it is a classic binary classification problem with both text and numerical features.
I decided to do it in the cloud. For that, the data distributed by Kaggle has to be loaded onto the Amazon EC2 instance. A plain download from the instance will fail, because Kaggle requires you to be logged in to access the data. No problem: the login is carried by the browser cookies, so we can export them and reuse them from the EC2 instance, as they commented here.
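The first thing we need is a browser plugin that saves the cookies into a text file. Use this for Firefox, and this for Chrome. Log in to Kaggle in the browser before exporting, so that the file contains the session cookies.
Then we upload the file to the EC2 instance by some means. In my case I use Bittorrent Sync (a post will be coming later on). We tell wget to use the cookies with the --load-cookies option. The -x option makes wget recreate the URL's directory structure locally, so the file ends up under www.kaggle.com/c/stumbleupon/download/: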
wget -x --load-cookies ~/BTSync/cookies.txt http://www.kaggle.com/c/stumbleupon/download/raw_content.zip
The output looks like this, and the data has been downloaded successfully:
ubuntu@ip-172-31-21-138:~/kaggle/evergreen$ wget -x --load-cookies ~/BTSync/cookies.txt http://www.kaggle.com/c/stumbleupon/download/raw_content.zip
--2013-09-09 22:37:17-- http://www.kaggle.com/c/stumbleupon/download/raw_content.zip
Resolving www.kaggle.com (www.kaggle.com)... 168.62.224.124
Connecting to www.kaggle.com (www.kaggle.com)|168.62.224.124|:80... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://kaggle2.blob.core.windows.net/competitions-data/kaggle/3526/raw_content.zip?sv=2012-02-12&se=2013-09-12T22%3A37%3A18Z&sr=b&sp=r&sig=qAJZIFUmRu%2B9XX%2FM%2B7qPorR%2FkWAC7%2B9W6MEWL5xM0fg%3D [following]
--2013-09-09 22:37:18-- https://kaggle2.blob.core.windows.net/competitions-data/kaggle/3526/raw_content.zip?sv=2012-02-12&se=2013-09-12T22%3A37%3A18Z&sr=b&sp=r&sig=qAJZIFUmRu%2B9XX%2FM%2B7qPorR%2FkWAC7%2B9W6MEWL5xM0fg%3D
Resolving kaggle2.blob.core.windows.net (kaggle2.blob.core.windows.net)... 65.52.106.46
Connecting to kaggle2.blob.core.windows.net (kaggle2.blob.core.windows.net)|65.52.106.46|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 164757969 (157M) [application/zip]
Saving to: ‘www.kaggle.com/c/stumbleupon/download/raw_content.zip’

100%[======================================>] 164,757,969 2.29MB/s in 95s

2013-09-09 22:38:53 (1.65 MB/s) - ‘www.kaggle.com/c/stumbleupon/download/raw_content.zip’ saved [164757969/164757969]

ubuntu@ip-172-31-21-138:~/kaggle/evergreen$
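If wget is not available, or the download should happen from inside a script, the same cookie trick can probably be done with the Python standard library. What follows is only a rough sketch, not something I ran against Kaggle: it assumes the exported cookies.txt is in the Netscape format that the browser plugins produce, and the function names build_cookie_opener and download are my own, made up for this illustration.

```python
import http.cookiejar
import shutil
import sys
import urllib.request


def build_cookie_opener(cookie_path):
    """Return (opener, jar) that send the cookies stored in a Netscape-format cookies.txt."""
    jar = http.cookiejar.MozillaCookieJar(cookie_path)
    # Browser exports often contain session cookies and stale entries;
    # load them all and let the server decide which ones are still valid.
    jar.load(ignore_discard=True, ignore_expires=True)
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def download(url, cookie_path, dest):
    """Fetch url with the saved cookies and stream the body to dest (hypothetical helper)."""
    opener, _ = build_cookie_opener(cookie_path)
    # The opener follows redirects by default, like the 302 wget reports.
    with opener.open(url) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)


if __name__ == "__main__" and len(sys.argv) == 4:
    # Arguments: URL, path to cookies.txt, output file
    download(sys.argv[1], sys.argv[2], sys.argv[3])
```

Saved as, say, fetch.py (again a made-up name), it would be called with the download URL, the path to cookies.txt and the output file name as its three arguments. Unlike wget -x, this sketch writes to a single file and does not recreate the URL's directory structure.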