JISC report on data curation
Okay, okay, so I’m finally going to have to admit—reluctantly—that most data curators have domain expertise, and that that’s the most desirable situation for researchers.
However, the Swan and Brown report offers even comp-lit majors like me a little hope:
On the other side of the coin, there are data scientists who argue that it is not necessary to be a subject expert in order to do the job effectively. There are some fundamental data science skills that are generic in nature, such as dealing with confidential research, data description and metadata, software, copyright and intellectual property rights, and data storage. Although this may be so, the core issue is that of effective communication between a data scientist and their research colleagues…
From a practical perspective, as demand for competent data scientists grows, so it will become necessary to cast the net as wide as possible. Subject knowledge is important, but so too are technical skills and people skills…
We must consider also the question of technical and computing aptitude…
Our online survey of current data scientists also showed that the data science community is evenly split on whether people skills are more important than technical skills for success as a data scientist – but then people’s opinions are often predicated on their own experiences and whether their own strengths lay toward the technical or people skills end of the spectrum. It is uncommon to find people who are excellent at both. We came across several examples of instances where people whose background was primarily computing and information technology became sufficiently familiar with the subject area of their specialist institutions that they were deemed to be effective data scientists.
In my mind, I compare this to the situation of librarians who become selectors or bibliographers in subject areas where they have no formal training. Let’s not kid ourselves, it happens, especially in the sciences. I’ve known some such—and paradoxically, they tended to be toward the more effective end of the scale. Admittedly, this is because they were people with courage sufficient to dive into an unfamiliar topic head-first, and such people tend to be naturally effective at whatever they turn their hand to. Sometimes, though, being buried in the subject is a positive disadvantage. Ever had a foreign-language teacher who was a native speaker, so embedded in the language that s/he couldn’t explain it? I have. I think this happens to scientists a lot.
I’m still reading the report, which is an evenhanded and intelligent one. I quibble with the idea that rigid terminology distinctions are appropriate at this early date, but I think the lines drawn in the report are useful ways to think about the problem as long as they’re not meant to reify it. I have most of the skills of the report’s “data librarian,” but some of those of the “data manager” as well. This is not a bad thing. It should be encouraged!