Report says data science education pays off in rewarding careers

Earning a college degree takes more than time and effort; it requires a significant financial investment. With college costs on the rise, how can prospective students ensure they get the best bang for their education bucks?

According to a report from Thrivent Mutual Funds highlighted in a recent edition of Forbes magazine, choosing a major that can translate into a data science career is one way to ensure that your career earning power will allow you to pay off those student loans quickly. The Thrivent Mutual Funds 100 Best Careers in America categorizes jobs by the amount of education needed to land a job in a field: an associate’s degree or certification, a four-year degree, or an advanced degree.

Experian solicits input from Big Data Hubs Community

Experian is soliciting input from the Big Data Innovation Hubs regarding our community’s data needs. To support academic research and new use cases that may involve industry, Experian has shared information on data sets they maintain, including the following:

If these data sets may be useful to your research, or if you would like more information, please send inquiries and comments to ken2115@columbia.edu and cc info@southbdhub.org.

Big Data Innovation Hubs Selected for NTIS Joint Venture Partnership

Partnerships called “a major milestone for the data economy”

The four Big Data Regional Innovation Hubs have been selected by the National Technical Information Service (NTIS) of the U.S. Department of Commerce to enter into a Joint Venture Partnership. Once finalized, this partnership will provide opportunities for collaborations between the Big Data Hubs and NTIS to deliver groundbreaking data projects across federal agencies.

PyData Carolinas offers tools and tips for bioinformatics research

Clarence White, PhD student at North Carolina A&T State University

PyData Carolinas 2016 brought together hundreds of professionals, researchers and students interested in data analysis to discuss how best to apply Python tools to meet challenges in data management, processing, analytics and visualization. Among the attendees was Clarence White, one of two students from North Carolina A&T who were sponsored by the South Big Data Hub to attend. The Hub was also a silver sponsor of PyData Carolinas. Below are Clarence’s thoughts on the conference.

My name is Clarence White, a Ph.D. student in computational science and engineering at North Carolina A&T State University.  In my research, I’m working on applying machine learning methods to bioinformatics problems.  Some areas of interest to me have been beta lactamase and phosphorylation site prediction. Beta lactamase is one of the main reasons behind the development of antibiotic resistance among pathogenic bacteria, and protein phosphorylation plays an important role in a wide range of cellular processes.

South Big Data Hub announces awards that apply data science to regional challenges

Ashok Goel of the Georgia Institute of Technology is principal investigator for one of the three research teams that will receive Spoke awards from the South Big Data Hub.  (photo courtesy of Georgia Tech)

Awards part of $11 million in National Science Foundation Big Data Hub “Spoke” awards

Three research teams in the Southern U.S. will receive funding for projects designed to use data science and data analytics to address challenges related to healthcare, environmental sustainability, and updating and improving power grids. The funding will be awarded through the “Big Data Spokes” program of the National Science Foundation’s (NSF) Big Data Regional Innovation Hub initiative.

Learning the nuts and bolts of data integration: A DataStart Fellow’s perspective

Aziz Eram reflects on her DataStart experience.

Aziz Eram, a master’s student at the University of Arkansas at Little Rock studying information quality, is one of six graduate students who participated in the South Big Data Hub’s DataStart Program. DataStart provides funding that allows talented graduate students to work as student fellows with startups that need data science talent. She served her summer fellowship with Black Oak Analytics in Little Rock. Below are her thoughts about the program.

My name is Aziz Eram, and I had the opportunity to intern at Black Oak Analytics, a Little Rock data startup, through a DataStart Fellowship managed by the South Big Data Hub. I did not come to the program with any industry knowledge, but I have a bachelor’s degree in computer science and statistics and a master’s in applied mathematics. I was excited to be hired as an intern at Black Oak, and to say that I have learned a lot in my internship is an understatement. I have grown tremendously, learning foundational data mining and data-driven marketing skills. Black Oak Analytics is a company that provides advanced solutions that allow organizations of any size to convert data into recommendations and actions designed to improve profitability, competitiveness, and customer satisfaction.

NIH BD2K offers lecture series on fundamentals of data science

As big data becomes ubiquitous in research and business, more and more people are finding they need guidance on how to make the most of their data and follow best practices. The National Institutes of Health (NIH) Big Data to Knowledge (BD2K) initiative recognizes that for biomedical researchers and clinicians to take full advantage of the data revolution they need…well, data, in the form of training, guidelines, expert advice, use cases, and more.

To meet this need, the BD2K now offers a virtual lecture series on the data science underlying biomedical research, featuring weekly presentations from experts on the fundamentals of data management, representation, computation, statistical inference, data modeling, and other topics relevant to big data in biomedicine. The BD2K Guide to the Fundamentals of Data Science Series offers live streaming presentations every Friday from noon to 1 p.m. Eastern time. The presentations are also recorded and posted online for future viewing and reference.

Two sessions are already online: Introduction to Big Data and the Data Lifecycle, and Data Indexing and Retrieval. They can be viewed on YouTube. The next live presentation, called Finding and Accessing Data Sets, Indexing and Identifiers, will be held Sept. 23 and will feature Lucila Ohno-Machado, MD, PhD, chair of the department of biomedical informatics at the University of California at San Diego.

There is no cost for attending or viewing a presentation and no registration is required. For more information about the series, including a list of upcoming lectures, visit the BD2K Training Coordinating Center website.

 

 

South Hub Sponsors Materials and Advanced Manufacturing Workshop

On August 25, nearly sixty people gathered for a workshop on Data Infrastructure for Materials and Advanced Manufacturing. Attendees came from throughout the southern US to attend the event, sponsored by the South Big Data Hub and the Computing Community Consortium, to assess and deliberate on the current state of the data infrastructure supporting the accelerated insertion of new and advanced materials into commercial products.

Stakeholders from industry, academia, national laboratories, and nonprofits convened to share their perspectives on challenges surrounding the use of data and informatics in materials discovery and development, and advanced manufacturing. The expertise of participants spanned materials science and engineering, design and manufacturing sciences, and computer and data sciences.

Speakers from industry included Rick Barto of Lockheed Martin, Kaisheng Wu of Thermo-Calc, Bryce Meredig of Citrine Informatics, Ramesh Subramanian of Siemens, and Rajiv Naik of Pratt & Whitney. Chuck Ward from the Air Force Research Laboratory and Turab Lookman from Los Alamos National Laboratory also presented their perspectives.

Following the talks, a series of smaller concurrent breakout sessions formed to discuss feasible crossover areas between industry and academic research. Michael Valley of Sandia National Laboratories moderated the session “high impact applications of data science in the materials-manufacturing sector.” Daniel Wheeler from the National Institute of Standards and Technology moderated a discussion on “challenges in the automation of the materials data life-cycle.” Raymundo Arroyave from Texas A&M University moderated a session on “education and training in materials-manufacturing data science and informatics.” David Fries of the Florida Institute for Human & Machine Cognition was the moderator of a discussion on “the materials-manufacturing innovation cyber-ecosystem.”

The whole group then reconvened for two all-inclusive round table discussions. Jason Hattrick-Simpers from the University of South Carolina and David McDowell of Georgia Tech drove a discussion on developing a set of objectives and an associated roadmap for achieving them. Surya Kalidindi of Georgia Tech led a discussion on establishing an advanced materials and manufacturing “Spoke” at the South Big Data Hub.

After a reception poster session, the event closed with a call to action to collect resources, create an online community for locating resources and for networking, and develop an administration transition paper. Co-Executive Director Renata Rawlings Goss is currently seeking people to take on leadership roles in developing resources for this new community. To participate, please contact her at rrawlings.goss@gatech.edu.

DataBridge tackles the problem of ‘dark data’

DataBridge, a National Science Foundation-funded project to make research data more discoverable and usable by a wide community of scientists, has the green light to expand its work into the neuroscience community, thanks to a new NSF EAGER award.

The award itself is relatively small (less than $100,000) and will allow the researchers to consult with neuroscientists, develop a prototype DataBridge for Neuroscience (DBfN), and hold a community workshop. However, the impact could be significant for a hot scientific field that is making breakthrough discoveries about the human brain.

Startup Plus Grad Student Internship Proves to be a Winning Combination

Combine young companies led by bright entrepreneurs with talented graduate students eager to use their skills and something good is bound to happen.

A case in point is the pairing of data.world, an Austin, Texas-based startup dedicated to making data (such as U.S. Census data) more accessible and usable, with Jonathan Ortiz, a graduate student in the data analytics program at the University of Texas at Austin.
