Below are the results of the Kaggle survey I conducted, in which participants were asked several questions about Machine Learning/Data Analysis to gather their personal views on the subject and the tools they use. The online platform that hosted the survey does not offer much analytics beyond copying and pasting the aggregated results per question, so here you go:
(survey link: http://es.surveymonkey.com/s/SYYTCF2)
TOTAL PARTICIPANTS: 40
1. What is your background?
Biochemistry 0,0% 0
Chemistry 0,0% 0
Computer Engineering (Software Development) 30,0% 12
Computer Science (AI/Machine Learning) 12,5% 5
Econometrics 0,0% 0
Economics 5,0% 2
Engineering (Electrical) 5,0% 2
Engineering (Mechanical) 0,0% 0
Engineering (Other) 0,0% 0
Mathematics 15,0% 6
Medicine 0,0% 0
Physics 7,5% 3
Statistics 12,5% 5
Other (Science Applied) 7,5% 3
Other (Science Pure) 0,0% 0
Other 5,0% 2
2. What is your preferred language for data analysis tasks?
Bash/sed/awk/any shell 0,0% 0
C/C++ 2,5% 1
Excel 0,0% 0
Java 5,0% 2
Maple 0,0% 0
Mathematica 0,0% 0
Matlab/Octave 5,0% 2
Perl 0,0% 0
Python 37,5% 15
R/S-Plus 35,0% 14
SAS 2,5% 1
SPSS 0,0% 0
Stata 0,0% 0
Weka 2,5% 1
Other 10,0% 4
3. Where do you live? (Select the option of your political mainland country: e.g., Canary Islands - Spain - Europe (South) )
America (North - Canada) 2,5% 1
America (North - US) 42,5% 17
America (North - Mexico) 0,0% 0
America (Central) 0,0% 0
America (South - Brazil) 0,0% 0
America (South - Argentina) 0,0% 0
America (South - Others) 0,0% 0
Africa (East) 0,0% 0
Africa (Equatorial) 0,0% 0
Africa (Mediterranean including Egypt) 0,0% 0
Africa (Sahara) 0,0% 0
Africa (South Africa) 2,5% 1
Africa (West) 0,0% 0
Asia (China) 0,0% 0
Asia (Japan) 0,0% 0
Asia (Korea) 2,5% 1
Asia (India) 5,0% 2
Asia (Middle East) 2,5% 1
Asia (Europe - Russia) 2,5% 1
Asia (Other) 2,5% 1
Europe (Central) 10,0% 4
Europe (East) 2,5% 1
Europe (Islands) 0,0% 0
Europe (North) 10,0% 4
Europe (South) 5,0% 2
Oceania 10,0% 4
4. Where do you originally come from?
America (North - Canada) 0,0% 0
America (North - US) 35,0% 14
America (North - Mexico) 0,0% 0
America (Central) 0,0% 0
America (South - Brazil) 0,0% 0
America (South - Argentina) 0,0% 0
America (South - Others) 0,0% 0
Africa (East) 0,0% 0
Africa (Equatorial) 0,0% 0
Africa (Mediterranean including Egypt) 0,0% 0
Africa (Sahara) 0,0% 0
Africa (South Africa) 2,5% 1
Africa (West) 0,0% 0
Asia (China) 5,0% 2
Asia (Japan) 0,0% 0
Asia (Korea) 2,5% 1
Asia (India) 7,5% 3
Asia (Middle East) 0,0% 0
Asia (Europe - Russia) 2,5% 1
Asia (Other) 2,5% 1
Europe (Central) 10,0% 4
Europe (East) 7,5% 3
Europe (Islands) 0,0% 0
Europe (North) 7,5% 3
Europe (South) 10,0% 4
Oceania 7,5% 3
5. Where did you study?
America (North - Canada) 0,0% 0
America (North - US) 42,5% 17
America (North - Mexico) 0,0% 0
America (Central) 0,0% 0
America (South - Brazil) 0,0% 0
America (South - Argentina) 0,0% 0
America (South - Others) 0,0% 0
Africa (East) 0,0% 0
Africa (Equatorial) 0,0% 0
Africa (Mediterranean including Egypt) 0,0% 0
Africa (Sahara) 0,0% 0
Africa (South Africa) 2,5% 1
Africa (West) 0,0% 0
Asia (China) 0,0% 0
Asia (Japan) 0,0% 0
Asia (Korea) 2,5% 1
Asia (India) 7,5% 3
Asia (Middle East) 2,5% 1
Asia (Europe - Russia) 2,5% 1
Asia (Other) 2,5% 1
Europe (Central) 7,5% 3
Europe (East) 2,5% 1
Europe (Islands) 2,5% 1
Europe (North) 10,0% 4
Europe (South) 7,5% 3
Oceania 7,5% 3
6. What are the hardware/software configurations you use? (Mark the hardware you perform your data computations on, not the hardware you merely own; i.e., do not mark GPU if you use it only for gaming and do not perform data analysis on it.)
Apple Macintosh 20,0% 7
Cloud (Amazon) 5,7% 2
Cloud (Other) 0,0% 0
GPU (ATI) 0,0% 0
GPU (Nvidia) 14,3% 5
CPU (AMD/K10) 0,0% 0
CPU (AMD/Bulldozer) 2,9% 1
CPU (AMD/Bobcat) 2,9% 1
CPU (Intel/i3) 5,7% 2
CPU (Intel/i5) 37,1% 13
CPU (Intel/i7) 37,1% 13
CPU (Intel/Ivy Bridge) 8,6% 3
CPU (Intel/Sandy Bridge) 11,4% 4
CPU (Intel/Other) 8,6% 3
CPU (Other) 5,7% 2
7. What OS/browser(s) do you use?
Linux (Chrome) 22,9% 8
Linux (Chromium) 2,9% 1
Linux (Firefox) 17,1% 6
Linux (Opera) 0,0% 0
Linux (Other) 0,0% 0
OSX (Chrome) 20,0% 7
OSX (Chromium) 0,0% 0
OSX (Firefox) 0,0% 0
OSX (Other) 0,0% 0
OSX (Safari) 2,9% 1
Windows (Chrome) 54,3% 19
Windows (Chromium) 0,0% 0
Windows (Firefox) 17,1% 6
Windows (Other) 5,7% 2
Windows (Safari) 0,0% 0
Other OS (Chrome) 0,0% 0
Other OS (Chromium) 2,9% 1
Other OS (Firefox) 0,0% 0
Other OS (Other) 0,0% 0
Other OS (Safari) 0,0% 0
8. Have you used any Hadoop-related tools for any data analysis?
Cassandra 0,0% 0
Lucene 0,0% 0
Hadoop 77,8% 7
Mahout 22,2% 2
Hama 0,0% 0
HBase 0,0% 0
Hive 22,2% 2
Pig 44,4% 4
9. What is the Machine Learning technique that you generally find most useful for classification/regression?
Adaboost 3,2% 1
Bayesian Networks 3,2% 1
kNN 0,0% 0
Linear Regression (Lasso/ElasticNet) 3,2% 1
Linear Regression (OLS/Ridge/other regularized) 3,2% 1
Linear Regression (Other) 0,0% 0
Linear SVC/SVR 0,0% 0
Logistic Regression 6,5% 2
Naive Bayes 0,0% 0
Neural Networks 12,9% 4
Random Forests 67,7% 21
SVM/SVR (Non-linear kernel) 0,0% 0
10. In your opinion, Machine Learning is mostly:
Engineering/Algorithmics 14,3% 5
Engineering/Algorithmics and Optimization 34,3% 12
Mathematics 5,7% 2
Optimization 2,9% 1
Physics 0,0% 0
Programming 5,7% 2
Statistics and Probability Theory 37,1% 13
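Not every question was answered by all 40 participants, and the percentages appear to be relative to the people who answered each question rather than to the total. Since the platform only gives pasted rows, here is a minimal Python sketch for reading them back and estimating that per-question base. The names `parse_row` and `estimate_respondents` are my own, and the row layout (label, percentage with a decimal comma, count) is assumed from the listings above.

```python
import re

# Assumed row layout, as pasted above: "<label> <pct with decimal comma>% <count>"
ROW = re.compile(r"^(?P<label>.+?)\s+(?P<pct>\d+,\d)%\s+(?P<count>\d+)$")


def parse_row(line):
    """Split one pasted row into (label, percentage, count)."""
    m = ROW.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognised row: {line!r}")
    pct = float(m["pct"].replace(",", "."))  # "77,8" -> 77.8
    return m["label"], pct, int(m["count"])


def estimate_respondents(rows):
    """Estimate how many people answered a question.

    Uses the row with the largest count, where the rounding of the
    published percentage hurts the least: base ~ count / (pct / 100).
    """
    _, pct, count = max(rows, key=lambda r: r[2])
    if count == 0:
        raise ValueError("no answers recorded for this question")
    return round(count / (pct / 100))


# Question 8 (Hadoop-related tools), copied from the listing above.
q8 = [parse_row(line) for line in [
    "Hadoop 77,8% 7",
    "Mahout 22,2% 2",
    "Hive 22,2% 2",
    "Pig 44,4% 4",
]]
print(estimate_respondents(q8))  # estimated number of people who answered
```

For question 8 this gives a base of 9 rather than 40, and the counts add up to more than that base, which suggests the question allowed several answers per person. The same check can be run on any other question by pasting its rows into the list.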