|
Data Science News
|
Dataphoric: Learn Data Science the Hard Way
|
Dataphoric
from June 27, 2015
So you want to be a Data Scientist? The good news is that there are tons of great resources out there to learn from. The bad? None is comprehensive, and choosing the best can be completely overwhelming. I created this list to help you stay focused on learning what’s important, the easiest way possible.
But it won’t be easy…
|
|
A dataset of every Reddit comment
|
Hacker News
from July 11, 2015
Someone has already put it on Google BigQuery – https://bigquery.cloud.google.com/table/fh-bigquery:reddit_comments.2015_05
Link to original Reddit thread – https://www.reddit.com/r/datasets/comments/3bxlg7/i_have_every_publicly_available_reddit_comment/
|
|
Preface to Python Data Science Handbook (Early Release)
|
O'Reilly Publishing, Jake VanderPlas
from July 06, 2015
This is a book about doing data science with Python, which immediately begs the question: what is data science? It’s a surprisingly hard definition to nail down, especially given how ubiquitous the term has become. Vocal critics have variously dismissed the term as a superfluous label – after all, what science doesn’t involve data? – or a simple buzzword which only exists to salt resumes and catch the eye of overzealous tech recruiters.
In my mind, these dismissals miss something important. Data science, despite its hype-laden veneer, is perhaps the best label we have for the cross-disciplinary set of skills that are becoming increasingly important in many applications across industry and academia.
|
|
Statistical Advice for A/B Testing
|
Insight Data Science
from July 09, 2015
A/B testing is awesome. It’s fun, it’s lucrative, and it’s an extremely visible and impactful way that you can create value as a data scientist. It’s both thrilling and deeply satisfying to see a change you proposed make a multi-million dollar difference. If only you could get paid on commission!
Before you write that email to your boss asking for a raise, though, it’s worth making sure that your A/B test evaluation process is correct. It would be… unfortunate if it turned out that your decision to color all your call-to-action buttons hot pink wasn’t worth the “mad stacks” that you claimed, and was in fact actively harmful. To avoid such embarrassments, you’d like to implement some sound statistical practices for evaluating your A/B tests.
Unfortunately, good statistical methods for A/B testing are more complicated than they might seem at first. (Check out the whitepaper Most Winning A/B Test Results are Illusory and Evan Miller’s How Not to Run An A/B Test for some interesting examples.) Statistical errors are easy to make, and these mistakes can fatally bias your A/B testing program. So here are four recommendations for avoiding some common statistical difficulties, and for achieving a successful and sound A/B test evaluation plan. Happy testing!
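For readers who want something concrete, here is a minimal sketch of one common way to evaluate an A/B test on conversion rates: a two-sided, pooled two-proportion z-test. It is not taken from the Insight article, whose four recommendations are not reproduced in this excerpt; the function name and the visitor and conversion counts are illustrative assumptions, and the test presumes a sample size fixed in advance with no peeking at interim results.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference between two conversion rates.

    conv_a, conv_b: number of conversions in variants A and B
    n_a, n_b:       number of visitors in variants A and B
    Returns (z, p_value). Assumes a fixed sample size decided up front.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that A and B convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts, purely for illustration.
z, p = two_proportion_z_test(conv_a=200, n_a=10_000, conv_b=230, n_b=10_000)
print(f"z = {z:.2f}, p = {p:.3f}")
```

With these made-up numbers the apparent 15 percent relative lift does not reach the conventional 0.05 significance threshold, which is the kind of result that is easy to misread as a win.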
|
|
Oracle Data Capture Makes Utilities’ Metering Operations More Efficient
|
Environmental Leader
from July 08, 2015
Oracle Utilities has launched DataConnect, a data extraction feature for Oracle Utilities Customer Care and Billing and Oracle Utilities Meter Data Management that allows utilities to more easily leverage data across their systems, including those provided by third-party vendors.
This new tool exports customer and usage information for use in downstream applications, enabling utilities to derive greater value from their data and provide new offerings, such as conservation programs and audit tools, which require access to consistent and accurate data.
|
|
Systems biology: Network evolution hinges on history
|
Nature
from July 08, 2015
The effects of mutations in proteins can depend on the occurrence of previous mutations. It emerges that such historical contingency is also important during the evolution of gene regulatory networks.
|
|
European labs set sights on continent-wide computing cloud
|
Nature News & Comment
from July 08, 2015
From astronomy to genomics, scientists are increasingly storing and studying their data sets on shared remote ‘cloud’ computing servers, accessed through the Internet. Three of Europe’s biggest research labs now want to help academics by working with commercial firms to create a continent-wide cloud-computing portal — and they are hoping to get backing from the European Commission.
Many researchers find cloud computing to be more flexible and efficient than buying expensive hardware — they can rent servers from firms such as Amazon and Google when they need a burst of power for an intensive computation, for example (see Nature 522, 115–116; 2015). Despite the advantages, some academics are concerned about security and reliability when storing their data on outside servers, says Bob Jones, a computer scientist at CERN, Europe’s particle-physics lab near Geneva, Switzerland.
|
|
Cutting cost and power consumption for big data
|
MIT News
from July 10, 2015
… at the International Symposium on Computer Architecture in June, MIT researchers presented a new system that, for several common big-data applications, should make servers using flash memory as efficient as those using conventional RAM, while preserving their power and cost savings.
The researchers also presented experimental evidence showing that, if the servers executing a distributed computation have to go to disk for data even 5 percent of the time, their performance falls to a level that’s comparable with flash, anyway.
In other words, even without the researchers’ new techniques for accelerating data retrieval from flash memory, 40 servers with 10 terabytes’ worth of RAM couldn’t handle a 10.5-terabyte computation any better than 20 servers with 20 terabytes’ worth of flash memory, which would consume only a fraction as much power.
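The arithmetic behind that comparison can be checked with a quick sketch. It uses only the figures quoted above and assumes, as a simplification not stated in the article, that data is accessed uniformly, so the share of accesses that go to disk equals the share of the data set that does not fit in RAM. The variable names are illustrative.

```python
# Figures quoted in the MIT News excerpt.
ram_servers, ram_total_tb = 40, 10.0
flash_servers, flash_total_tb = 20, 20.0
workload_tb = 10.5

# Capacity per machine in each cluster.
ram_per_server_tb = ram_total_tb / ram_servers        # 0.25 TB each
flash_per_server_tb = flash_total_tb / flash_servers  # 1.0 TB each

# Data that cannot fit in the RAM cluster and must live on disk.
overflow_tb = max(0.0, workload_tb - ram_total_tb)

# Assumption: uniform access, so this is also the share of disk-bound reads.
overflow_fraction = overflow_tb / workload_tb

print(f"RAM per server:   {ram_per_server_tb:.2f} TB")
print(f"Flash per server: {flash_per_server_tb:.2f} TB")
print(f"Share of workload spilling to disk: {overflow_fraction:.1%}")
print(f"Workload fits entirely in flash: {workload_tb <= flash_total_tb}")
```

Under that assumption, just under 5 percent of the 10.5-terabyte workload spills to disk on the RAM cluster, which lines up with the 5 percent figure the researchers cite, while the flash cluster holds the whole data set with room to spare.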
|
|
How SAP plans to bring analytics to soccer
|
CIO
from July 07, 2015
SAP and City Football Group have announced a global, multi-year partnership that will deliver data analytics to every level of CFG and its international football clubs, from business operations to fan engagement to player and team performance.
|
|
Harnessing computer power to understand biology
|
Science Careers
from July 07, 2015
An ability to combine computation and experimentation to discover new insights about the regulation of gene expression and assembly of protein complexes recently won Sarah Teichmann a 2015 EMBO Gold Medal. The award recognizes outstanding scientific achievements from young researchers in Europe in the field of molecular biology. … Science Careers asked Teichmann how she gained her skills and abilities and what doors they opened to her. This interview has been edited for clarity and brevity.
|
|