Wednesday 6 October 2010

Experiences with SharePoint/Groove?

One of the issues arising from this project is the practicalities of working either off-site or across institutions, namely how researchers can maintain effective control of data and data collections when they are based in different locations or working away from their institution. One potential solution that has been suggested in the course of our data management interviews is Microsoft SharePoint 2010/Groove. I am interested in following up on this. So, does anyone have experience of working with SharePoint 2010/Groove? I'm interested in hearing about its performance from a data management perspective - particularly in terms of shared workspaces, remote access, version control, and data security. Does it work? Does it work well? Is it easy to use and adapt to? What doesn't work well, what are the bugs? What could it do better? I'd be grateful for any insights and testimonials.

Friday 27 August 2010

Last question...social science?

Social science
"the study of society and the manner in which people behave and impact on the world around us and includes disciplines such as economics, law, sociology, psychology, business studies, education, politics and international studies."

UK Strategy for Data Resources for Social and Economic Research 2009-2012, p.8

Thursday 26 August 2010

And a data collection is...?

Data collection
A data collection is typically comprised of three components: data, documentation and metadata. Occasionally, a fourth component of code exists. Data collections are typically organised by reference to a particular survey or research topic and cover a specific geographic area and time period.

UK Data Archive (2010), UK Data Archive Preservation Policy, pp.14-15

Wednesday 25 August 2010

Well, what's documentation then?

Documentation
Documentation is that portion of a data collection that is required in order to re-use data. It commonly covers the subjects of sampling design, methods of data collection, questionnaire/interview design, structure of the data files, lists of variables and coding schemes, details of weighting, confidentiality and anonymisation, and provenance of any secondary data used. It also includes licence arrangements and all materials obtained through the original negotiation and data deposit, as well as post-deposit information created during preservation and ingest activities. The terms metadata and documentation are often used interchangeably and there is overlap between the two, though documentation tends to have a structure that is specific to each data collection.

UK Data Archive (2010), UK Data Archive Preservation Policy, pp.14-15

Tuesday 24 August 2010

Ok, so data management...fine, but what is/are data?

Data
Data are all the material, regardless of format, which are intended to be analysed. As part of datasets, they are the primary element of a data collection. More precise definitions of data vary according to context. Quantitative data may refer to just the matrices of numbers or words that comprise a data file, but may also cover other information (metadata) held within a statistical package data file, such as variable labels, code labels and missing value definitions. Qualitative data might include interview transcripts as well as audio and video recordings (analogue or digital).

UK Data Archive (2010), UK Data Archive Preservation Policy, pp.14-15

Monday 23 August 2010

Progress report

This blog was intended as an experiment. The problem I've found in maintaining it is that it was difficult to be informative about the progress of the project and the challenges and problems we were encountering while maintaining a level of confidentiality as to with whom and where we were encountering them. Twitter takes care of the informative aspect, while this blog was seeming more like a commentary on the standard of spreads and hospitality provided by centres and programmes (which by the way has been excellent).

 
However, we are around the mid-point. Last week our progress report was approved, and arising from it we reported three main themes emerging from the project in terms of outputs and training.
  • Data ownership. A lack of awareness about who owns primary data, and a lack of consideration about the implications of using secondary data in terms of licences and copyright.
  • A need to devise strategies and tools for working across institutions.
  • Difficulty in getting good data from centres and programmes on data management costs.
For a richer explanation of these themes we have produced a report on current data management practices in the social sciences.

 
Our next challenge is to devise centre-specific strategies to address these themes, but strategies that can also have a generic application for social science data investments.

Thursday 1 April 2010

Meeting CRESC

Today we visited the Centre for Research on Socio-Cultural Change (CRESC) at the University of Manchester. CRESC is an ESRC funded Research Centre analysing socio-cultural change. Its research projects cover both quantitative (longitudinal survey analysis) and qualitative (ethnography, interviewing, audio and visual data) work.

Three issues emerged from our meeting that open up interesting areas to consider. First, data collection is not central to its mission. Outside of qualitative work, their projects tend to generate data from existing data; this produces a lot of "derived" data. This led to an ongoing discussion on the question of what counts as data.

The second issue was that CRESC's organisational approach is project-governed. This is a reflection of the nature of the centre in working across different disciplines with different research norms and values.

Finally, there is CRESC's own interest in the theme of the "social life of methods". This looks at how methods are themselves an agent of social change. An interest was expressed in following our project closely; effectively studying us. Interesting.

Monday 22 March 2010

Meeting the TSRC

We travelled to the University of Birmingham for an introductory meeting with the Third Sector Research Centre (TSRC). The TSRC is one of the ESRC research centres we are working with as part of our project.

Overall, it was a positive meeting. The three main topics our discussion covered were intellectual property rights, consent and anonymisation, and administrative organisation.

There is enthusiasm from the TSRC to get a data management plan established and embedded soon. Most of their research is starting now and they want to have good procedures in place and avoid later data problems. Our next task is to work with the TSRC to undertake a data audit in April/May with training taking place in June.

Thursday 25 February 2010

Visit from the DCC

Yesterday we were visited by some representatives from the Digital Curation Centre. Their visit was intended to provide an introduction, guidance and support to data management tools they have developed. The visit covered the Data Audit Framework (DAF), Assessing Institutional Digital Assets (AIDA) and Keeping Research Data Safe 2 (KRDS2). The final hour was devoted to a discussion on where the Data Management Planning for ESRC Research Data Rich Investments project can learn from and contribute to the DCC.

What I took from this meeting was that there are already some very good data management tools in existence. These can be used, refined, and adapted and offered back to the data management community. For example, our remit is to consider data management practices at centre and programme level - effectively looking at cross institutional, multi-research data. This will identify issues that would not exist at departmental level, within institutions, or within single research projects. My sense from this was that there is no generic solution for all scientific and social research.

It was rewarding to meet others in the data management community, and discuss our relationship within that community.

Thursday 11 February 2010

Meetings

Yesterday we had a meeting with the programme manager from JISC and our colleague on the project team from the ESRC. One initial aim was to introduce each other in person, and establish where this project fits into JISC's wider programme on data management and how it contributes to estimating the costing of managing data and promoting a culture of sharing.

It was an interesting and wide-ranging meeting, but the essential points for progress on this project were to sign off on our project plan and approve our choices for cooperation and the level of involvement with the centres and projects that have expressed an interest in working with us. Following from this, we have already drafted a letter that will be sent by the ESRC to the centres and programme.

The next stage, once that letter has been sent, is to set up a meeting with the centres and programmes. As was agreed, we can approach them informally to establish possible dates because the timing is tight. At this meeting we can then work out a memorandum of understanding with the centre or programme and its ESRC case officer, as to the precise nature of cooperation and involvement.

One final, minor but relevant point. Blogging is encouraged as an informal form of monitoring and updating on progress. So, keep watching this space.

Monday 8 February 2010

Defining the field

This morning we met to narrow down the centres and programmes we will work with. The plan was to select one programme and two or three centres with which to work closely and to work to a lesser degree with the remainder that expressed an interest in cooperation. This may involve interviews with principal investigators and/or a retrospective look at their data management practice. We have now identified some centres and programmes that cover some of the diverse issues in data management such as complex, sensitive, confidential, and commercial data. The next stage is to contact them and begin to sort out a meeting.
Following from this I worked up a spreadsheet to track our progress working with these centres and programme.
Finally for today, I started putting together an interview schedule/questionnaire to discover existing data management practices. It's very much a work in progress, but is based on existing guidance from RELU, Timescapes, the Teaching and Learning Research Programme and our own data management guidance document.

Friday 29 January 2010

Filling in gaps.

Another week over. By the end of today the project will have its project plan sent to JISC; it outlines our aims and objectives, overall approach and outcomes. Looking at the timetable, we are off to a fast start with plenty to be completed by the end of April. The first objective? Gathering evidence of existing data management practices in past and present research centres and programmes. So, the next task for me is to fill in some details on ESRC programmes and centres so we can begin to define who, and how, we are going to work with. Things I need to establish are the types of research and data collected, the re-use potential of that data and how they link to some of the priorities we are interested in, be it complex, sensitive, confidential, or commercial data. Finally, I need to establish what their existing data management and sharing practices are.

Thursday 21 January 2010

Positive responses, project plans

Critical to this project is the cooperation of ESRC funded Programmes and Research Centres. Part of the project involves travelling to a few of these programmes or research centres to learn and assess how they go about managing data. So, it's a good sign that having contacted a range of programmes and centres last week, we have had a number of positive responses to our involvement. Of course, the details have to be sorted and official approvals agreed.

Meanwhile, at the end of this month the project needs to submit its budget and plans to JISC. So, I have spent the majority of my time so far this week preparing material for submission and, having finally gained access to edit our website, filling it with content.

Thursday 14 January 2010

Coming attractions

I'm working on putting together a couple of web pages for the Research Data Management programme here at the UK Data Archive. When I get the software installed and the permissions to edit granted it will emerge, quietly at first and then with a louder presence when it's respectable for public viewing. I have drafted text for the front page that outlines what the project is about within the context of archiving and secondary use. It's intended to be brief yet cover the immediate and long-term aims and tries to be accessible and open in tone - authoritative and technical where needed, but more conversational and informal where possible.

Also, look out for a Twitter account and, thinking aloud now, possibly a podcast.

Wednesday 13 January 2010

Link dump: sources on data management

Australian National Data Service: Data Management Planning http://ands.org.au/resource/data-management-planning.html

Digital Curation Blog http://digitalcuration.blogspot.com/

Digital Curation Centre: Data Management Plan Content Checklist http://www.dcc.ac.uk/docs/templates/DMP_checklist.pdf

Digital Curation Centre: Digital Curation Manual http://www.dcc.ac.uk/resource/curation-manual/

Digital Curation Centre: Research Data Management Forum http://www.dcc.ac.uk/data-forum/

Digital Curation Centre: The DCC Curation Lifecycle Model http://www.dcc.ac.uk/docs/publications/DCCLifecycle.pdf

ICPSR: Guide to Social Science Data Preparation and Archiving http://www.icpsr.umich.edu/files/ICPSR/access/dataprep.pdf

JISC: Managing Research Data http://www.jisc.ac.uk/whatwedo/programmes/mrd.aspx

MIT: Data Management and Publishing http://libraries.mit.edu/guides/subjects/data-management/

MIT: Managing Research Data 101 http://libraries.mit.edu/guides/subjects/data-management/Managing_Research_Data_101_IAP_2010.pdf

National Institutes of Health: Data Sharing http://grants.nih.gov/grants/policy/data_sharing/data_sharing_workbook.pdf

National Institutes of Health: Examples of Data Sharing Plans http://grants.nih.gov/grants/policy/data_sharing/data_sharing_guidance.htm#ex

RELU: Data Management plan http://www.data-archive.ac.uk/relu/plan.asp

Research Information Network and Digital Curation Centre Research Data Management Forum http://data-forum.blogspot.com/

Research Information Network: Data Management and Curation http://www.rin.ac.uk/our-work/data-management-and-curation

The Oak Ridge National Laboratory Distributed Active Archive Center (ORNL DAAC): Best Practices for Preparing Environmental Data Sets to Share and Archive http://daac.ornl.gov/PI/bestprac.html

UK Data Archive: Manage and Share Data http://www.data-archive.ac.uk/sharing/sharing.asp

UKOLN: Dealing with Data: Roles, Rights, Responsibilities and Relationships http://www.jisc.ac.uk/media/documents/programmes/digitalrepositories/dealing_with_data_report-final.pdf

University of Edinburgh: Research data management guidance http://www.ed.ac.uk/schools-departments/information-services/services/research-support/data-library/research-data-mgmt

Tuesday 12 January 2010

Day two: reviewing what's out there.

New jobs are always about reading up. You think you have an idea about the position, but when you finally sit at the desk you know there's so much more to learn. So, I have been searching for existing guides and practices on data management.

My initial main duties are to begin setting up a website for this project, and begin preparing an interim report. So, I'm going to read and reflect as much as possible on these existing practices and if possible come up with a sort of lit review. More for myself than anything.

Monday 11 January 2010

First day

Today is my first day in a new position. I am Research Data Management Senior Officer in the UK Data Archive working on a JISC supported project on research data management for ESRC data-rich investments. This involves working with ESRC Research Centres and programmes to assess their data management practices in social science research, to implement and help develop effective data management planning, and to increase capacity through support and training. The project is due to run from now until the end of March 2011.

My background is having worked at the UK Data Archive for a couple of years. My former position was in the data processing section. Essentially, this was the link between depositors of data and secondary users where I was responsible for the validation, conversion, enrichment, and preparation for preservation and secondary use of social science data, accompanying documentation, and metadata. I worked almost exclusively with qualitative data, but essentially there was little difference between quantitative and qualitative processing, other than the length of time qualitative datasets took to read. In addition, I also presented ESDS workshops, assisted with constructing teaching material and wrote promotional articles based on enhanced data.