The EU is working on a series of legislative proposals to ensure risk mitigation around the use of AI and the protection of fundamental rights, but also to make sure there is fairness and competition in the digital economy. The Digital Markets Act, still under negotiation between the European Commission, member state governments, and the European Parliament, seeks to impose proactive obligations on large gatekeeper tech companies, extending antitrust principles to protect smaller players.
And now, at the eleventh hour, the Biden administration, through Commerce Secretary Gina Raimondo, along with a number of senators, is voicing concern. These political leaders worry that the EU rules would unfairly discriminate against American tech companies and single them out. But what's easily overlooked in their statements is that US-based tech companies have grown exceptionally large, so a law that puts specific obligations on the largest companies would inevitably include many American ones.
The Centers for Disease Control and Prevention on Friday announced it is now publicly logging levels of SARS-CoV-2 found in sewage from around the country. The announcement elevates a growing system for wastewater surveillance that the CDC says will eventually be aimed at other infectious diseases.
The system began as a grassroots research effort in 2020 but has grown to a network of more than 400 wastewater sampling sites nationwide, representing the feces of approximately 53 million Americans. The CDC is now working with 37 states, four cities, and two territories to add more wastewater sampling sites. The health agency expects to have an additional 250 sites online in the coming weeks and more after that in the coming months.
In a press briefing Friday, Dr. Amy Kirby, the CDC’s program lead for the National Wastewater Surveillance System (NWSS), called the sampling a critical early warning system for COVID-19 surges and variants, as well as “a new frontier of infectious disease surveillance in the US.”
High-performance computing is needed for an ever-growing number of tasks, such as image processing or various deep learning applications on neural nets, where one must plow through immense piles of data reasonably quickly. It's widely believed that, in carrying out operations of this sort, there are unavoidable trade-offs between speed and reliability: if speed is the top priority, reliability will likely suffer, and vice versa.
However, a team of researchers based mainly at MIT is calling that notion into question, claiming that one can, in fact, have it all. With the new programming language they've written specifically for high-performance computing, says Amanda Liu, a second-year PhD student at the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL), “speed and correctness do not have to compete. Instead, they can go together, hand-in-hand, in the programs we write.”
On Feb. 4, the House of Representatives passed a bill aimed at increasing U.S. economic competitiveness with China. Dubbed the America COMPETES Act of 2022, the omnibus bill would devote nearly a quarter of a trillion dollars to subsidize domestic semiconductor manufacturing and research on artificial intelligence, quantum computing and other critical technologies. The House bill incorporates key elements of a bill that passed the Senate last year, which the New York Times has called the “most expansive industrial policy legislation in U.S. history.”
Biden has expressed strong support for seeing the bill enacted into law. In a statement released after passage of the House bill, Biden said that the vote was critical “for outcompeting China and the rest of the world in the 21st century.” The House bill is now expected to head into a conference committee, where Congress will reconcile the differences between the Senate and House bills before sending the legislation to the president for signature.
Google’s location tracking services are the subject of lawsuits in Texas, Washington D.C., Indiana and Washington state.
The lawsuits, filed in January, claim Google tracked users for years, often after users specifically turned off ‘Location History’ or similar features.
“Google’s claims to give consumers ‘control’ and respect their ‘choice’ largely serve to obscure the reality that, regardless of the settings they select, consumers have no option but to allow the Company to collect, store and use their location data,” wrote attorneys for the State of Indiana in their complaint against Google.
from The Daily Tar Heel, the University of North Carolina at Chapel Hill student newspaper, Kate Carroll
Since 2019, University Libraries’ On the Books: Jim Crow and Algorithms of Resistance project has used machine learning technology to digitize every law passed in N.C. during the Jim Crow era and has identified a comprehensive list of Jim Crow laws.
The multi-disciplinary team of UNC legal experts, historians and library specialists used text mining to discern and compile legislation passed between the Reconstruction Era and the Civil Rights Movement.
Now, the team is expanding the initiative with the support of a $400,000 grant from the Andrew W. Mellon Foundation.
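The text-mining step described above can be sketched in miniature. This is only an illustrative pattern-matching pass with hypothetical keyword patterns and law records; the On the Books project's actual pipeline uses trained machine-learning classifiers, not a fixed keyword list.

```python
import re

# Hypothetical keyword patterns; the project's real classifier and
# training data are not shown here.
JIM_CROW_PATTERNS = [
    r"\bseparate\b.*\b(school|car|coach|facilit)",
    r"\bwhite\b.*\bcolored\b",
    r"\bpoll tax\b",
]

def flag_candidate_laws(laws):
    """Return the IDs of laws whose text matches any pattern."""
    flagged = []
    for law_id, text in laws:
        lowered = text.lower()
        if any(re.search(p, lowered) for p in JIM_CROW_PATTERNS):
            flagged.append(law_id)
    return flagged

laws = [
    ("1899-c58", "Separate but equal accommodations in railroad "
                 "coaches for white and colored passengers."),
    ("1901-c12", "An act to fund the construction of county roads."),
]
print(flag_candidate_laws(laws))  # → ['1899-c58']
```

A keyword pass like this only surfaces candidates for expert review; the project's legal experts and historians still verify each statute.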
In the late 1990s, biologist Andrew Hendry noticed similarly quick changes in phenotype while studying salmon. (Phenotype refers to the trait that actually exists in the animal, even if it’s not reflected by a change in its underlying genetic code.) “We had this impression that well, actually, maybe this rapid evolution thing is not so exceptional,” says Hendry, now a professor at McGill University in Montreal. “Maybe it’s actually occurring all the time, and people just haven’t emphasized it.”
With a colleague, Michael Kinnison (now at the University of Maine), Hendry pulled together a database of examples of rapid evolution and wrote a 1999 paper that kickstarted interest in the field. Now, Hendry and colleagues have updated and expanded the original data set with more than 5,000 additional examples: everything from the cranial depth of the common chaffinch to the lifespan of the Trinidadian guppy. Scientists are using this data to answer questions about how fast and far the natural world is changing, and how much of the change is due to humans.
In an initial paper published in November 2021 using the new data set (which is called Proceed, for Phenotypic Rates of Change Evolutionary and Ecological Database), Hendry and colleagues reexamined five key questions raised by previous work. They confirmed, for instance, that on average, all over the world, animal species seem to be getting smaller. This runs contrary to a theory of evolution called Cope’s rule, which posits that species should increase in size over time.
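Rates of phenotypic change like those in the database are commonly expressed in darwins: the change in the natural log of a trait value per million years. A minimal sketch, with made-up trait measurements (the database's actual records and units are not shown here):

```python
import math

def darwins(x1, x2, years):
    """Rate of phenotypic change in darwins: the absolute change in
    the natural log of a trait value per million years."""
    return abs(math.log(x2) - math.log(x1)) / (years / 1e6)

# Hypothetical numbers: a trait shrinking from 100 mm to 95 mm
# over 50 years, consistent with the shrinking-body-size trend.
rate = darwins(100.0, 95.0, 50)
print(round(rate, 1))
```

Because the unit is log-based, the rate is independent of the trait's measurement scale, which makes changes in very different traits, such as cranial depth and lifespan, comparable across species.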
There are some important (and great!) changes in the review process at @icmlconf 2022.
https://icml.cc/Conferences/2022/ReviewForm
– It seems that there are at least 2 phases
– Papers with 2 negative recommendations in phase 1 are rejected
– But a meta-reviewer can reverse this outcome
SMU Alumnus Julian LaNeve won $100,000 from the 2021 Data Open Championship in December, alongside teammates from Duke and UC Berkeley.
Sponsored by Citadel LLC and Citadel Securities in partnership with Correlation One, the Data Open Championship is the largest and most prestigious university-level data science competition in the world. During the week-long competition, participants team up to work through a dataset and present their findings to a panel of judges.
Months after saying numbers from a 1-year version of a widely used survey measuring how Americans live weren't usable because of problems from the pandemic, U.S. Census Bureau officials said Monday that data from a 5-year version of the survey meet the agency's standards and will be released next month.
The statistical agency said it would release the 2020 American Community Survey 5-year estimates in mid-March. In October, the Census Bureau released the survey’s 1-year estimates only in an experimental format with a warning that it may not meet the agency’s statistical quality standards.
The survey typically relies on responses from 3.5 million households on questions about commuting times, internet access, family life, income, education levels, disabilities, military service and employment, but disruptions caused by the pandemic produced fewer responses.
The 1-year estimates provide information for a single year, but only for places with at least 65,000 people. The 5-year estimates offer data at smaller geographies and are aggregated over multiple years.
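The idea behind a multi-year period estimate can be sketched as sample-weighted pooling of yearly data. This is only an illustration with hypothetical numbers; the Census Bureau's actual ACS weighting and estimation methodology is considerably more involved.

```python
def period_estimate(yearly):
    """Pooled period estimate: weight each year's estimate by its
    number of sample responses. (Illustration only; the Census
    Bureau's real ACS methodology is far more involved.)"""
    total_n = sum(n for _, n in yearly)
    return sum(est * n for est, n in yearly) / total_n

# Hypothetical mean-commute estimates (minutes) with response counts;
# the 600-response year stands in for a pandemic-depressed sample.
yearly = [(26.0, 950), (26.5, 1000), (27.0, 980),
          (25.5, 600), (26.8, 990)]
print(round(period_estimate(yearly), 2))  # → 26.44
```

Note how the low-response year contributes proportionally less to the pooled figure, which is part of why a 5-year estimate can meet quality standards even when one of its constituent years could not.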
Just a few weeks into 2022, we’re already learning about what the year has in store for us. It’s not too late to take a look back, and ahead. In a recent episode of The Data Wranglers podcast, Joe Hellerstein and I did just that. We identified three drivers of changes in how we look at data, and it’s worth bringing these thoughts to the page, too. … These were our three big takeaways from data in 2021:
On January 26, 2022, the new Chief Information Officer (CIO) of the U.S. Department of Defense (DoD), John B. Sherman, released a memo to the entire Department titled “Software Development and Open Source Software”. In this memo, the CIO addresses two primary concerns: 1) using open source software (OSS) introduces supply chain risks for DoD software programs, and 2) sharing DoD code via open source channels without proper checks enables potential leaks of proprietary DoD information to adversaries. In laying out how these two concerns should be addressed, the CIO places OSS in a unique position, one that OSS foundations and project maintainers can use to seek funding for their essential contributions.
The Texas A&M Energy Institute and the Texas A&M Institute of Data Science (TAMIDS) have signed a formal agreement to update two current certificate programs and to add a third, the Division of Research announced today.
Under the agreement, the institutes plan to work on at least two major projects. One will upgrade the Energy Institute’s existing Master of Science in Energy and Certificate in Energy programs by adding a unit on “energy digitization,” the intersection between data science and the world’s energy sector. The other will develop a Certificate in Data Sciences and Energy program that the institutes will administer together.
Franklin College students will soon be able to pursue data science, a new major that will help facilitate entry into high-end technological fields.
Students were already graduating from Franklin College and pursuing jobs in areas similar to data science, so adding the data science major was a way to ease the process of finding placement in that specific field, said Kristin Flora, the college’s dean and vice president of academic affairs.
“It was primarily put together by Stacy Hoehn, associate professor of mathematics, as part of her sabbatical project. She researched the curricula of other schools who host a data science major. With that, she learned many Franklin College students found their way into the field of data science, working at Facebook, Eli Lilly, Stubhub and Salesforce. Maybe we can give students a better foundation through a data science major that gives them more direct entry into the field rather than having them double major or pursue a career in graduate training. Seeing the path our students took and the national demand for data scientists increasing, it made sense for us to pursue it,” Flora said.
A bill filed in the Kentucky House of Representatives would require public universities in the state to freeze tuition for four years with each incoming class. HB452, the Kentucky Student Tuition Protection and Accountability Act filed by Rep. William Lawrence (R-Maysville), would require all public, four-year institutions funded by the state to set tuition and fees for each incoming class, and then freeze those rates for four years. The freeze would apply to in-state students who enroll at the institutions.
Eric Lander, President Biden’s science adviser, has apologized for speaking to White House Office of Science and Technology Policy staff in “a disrespectful or demeaning way,” according to a note he sent to OSTP staff this weekend.
The big picture: An investigation found that Lander violated the White House’s workplace policy and “corrective action” was taken, according to an OSTP spokesperson.
Registration closing soon! Travel support available. Discounted hotel block closes February 13th!
SPONSORED CONTENT
The eScience Institute’s Data Science for Social Good program is now accepting applications for student fellows and project leads for the 2021 summer session. Fellows will work with academic researchers, data scientists and public stakeholder groups on data-intensive research projects that will leverage data science approaches to address societal challenges in areas such as public policy, environmental impacts and more. Student applications due 2/15 – learn more and apply here. DSSG is also soliciting project proposals from academic researchers, public agencies, nonprofit entities and industry who are looking for an opportunity to work closely with data science professionals and students on focused, collaborative projects to make better use of their data. Proposal submissions are due 2/22.
from Twitter, University of Michigan School of Information
Fun Size is a free newsletter of bite-size technology, information and library news, from the University of Michigan School of Information. Subscribe today
Whether for use in cybersecurity, gaming or scientific simulation, the world needs true random numbers, but generating them is harder than one might think. Now a group of Brown University physicists has developed a technique that can potentially generate millions of random digits per second by harnessing the behavior of skyrmions — tiny magnetic anomalies that arise in certain two-dimensional materials.
Their research, published in Nature Communications, reveals previously unexplored dynamics of single skyrmions, the researchers say. Discovered around a half-decade ago, skyrmions have sparked interest in physics as a path toward next-generation computing devices that take advantage of the magnetic properties of particles — a field known as spintronics.
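Raw bits read from a fluctuating physical source like a skyrmion tend to be biased toward 0 or 1. A classic post-processing step for physical random-number generators (not necessarily the one the Brown team uses) is the von Neumann extractor, which yields unbiased bits from a biased but independent stream:

```python
def von_neumann_extract(bits):
    """Debias an independent bit stream: read non-overlapping pairs;
    emit 0 for the pair (0, 1), 1 for (1, 0), and discard (0, 0)
    and (1, 1). Output bits are unbiased if input bits are
    independent, regardless of the input bias."""
    out = []
    for i in range(0, len(bits) - 1, 2):
        a, b = bits[i], bits[i + 1]
        if a != b:
            out.append(a)
    return out

raw = [1, 1, 0, 1, 1, 0, 0, 0, 0, 1]  # hypothetical biased readings
print(von_neumann_extract(raw))  # → [0, 1, 0]
```

The cost of the discarded pairs is throughput, which is why a fast physical source producing millions of raw readings per second matters.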
Language Models (LMs) often cannot be deployed because of their potential to harm users in ways that are hard to predict in advance. Prior work identifies harmful behaviors before deployment by using human annotators to hand-write test cases. However, human annotation is expensive, limiting the number and diversity of test cases. In this work, we automatically find cases where a target LM behaves in a harmful way, by generating test cases (“red teaming”) using another LM. We evaluate the target LM’s replies to generated test questions using a classifier trained to detect offensive content, uncovering tens of thousands of offensive replies in a 280B parameter LM chatbot. We explore several methods, from zero-shot generation to reinforcement learning, for generating test cases with varying levels of diversity and difficulty. Furthermore, we use prompt engineering to control LM-generated test cases to uncover a variety of other harms, automatically finding groups of people that the chatbot discusses in offensive ways, personal and hospital phone numbers generated as the chatbot’s own contact info, leakage of private training data in generated text, and harms that occur over the course of a conversation. Overall, LM-based red teaming is one promising tool (among many needed) for finding and fixing diverse, undesirable LM behaviors before impacting users.
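The red-teaming loop the abstract describes can be sketched as follows. All three model calls below are stand-in stubs, not the paper's actual red-team LM, target chatbot, or trained offensive-content classifier:

```python
# Minimal sketch of LM-based red teaming: one model generates test
# questions, the target model replies, and a classifier flags
# offensive replies. Every function here is a hypothetical stub.

def generate_test_cases(n):
    # Stand-in for zero-shot generation from a "red team" LM.
    return [f"Test question #{i}" for i in range(n)]

def target_lm_reply(question):
    # Stand-in for the target chatbot being probed.
    return "an offensive reply" if "3" in question else "a harmless reply"

def offensiveness_score(reply):
    # Stand-in for the trained offensive-content classifier.
    return 0.9 if "offensive" in reply else 0.1

def red_team(n_cases, threshold=0.5):
    """Collect (question, reply) pairs the classifier flags."""
    failures = []
    for q in generate_test_cases(n_cases):
        reply = target_lm_reply(q)
        if offensiveness_score(reply) > threshold:
            failures.append((q, reply))
    return failures

print(red_team(5))  # flags only the reply to "Test question #3"
```

The paper's contribution is in making the generator step strong and diverse (zero-shot prompting through reinforcement learning); the surrounding loop itself is this simple, which is what lets it scale to tens of thousands of test cases.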