Collection date: 2018-01-01
Answer options: The hardest thing to understand about other people's data is often selection criteria that are hard to express., In my field incomplete capture and sample bias are inherent. Researchers must use the proper design and estimator to minimize these biases. They must report capture/detection probabilities and how these were estimated along with the other metadata., ecology data is highly variable and small differences in collection have huge impacts on data meaning. it's a difficult thing to standardize!, License assignment were easy, clear, and respected, all data collection and processing by others was done with open source code that was distributed with the data., Associated cleaning/organization scripts were available, Clear information on how the data is to be cited is available!, Conditions/licenses for reuse were explicit and there was documentation that data were collected in accordance with ethical standards., If data was integrated/federated across different domains where appropriate, Most of the data I have to manage are really very dirty and a mess. No metadata standard can do anything against that, and I wish I had a data dictionary describing at least the units of measurement and types of the variables that are created in datasets. Most of the metadata standards do not even consider that issue., if I can visualize the data and its documentation, I knew where it came from and for what purpose it was collected., there are sufficient identifier methods, If there are questions about the data (for example, that did not seem important at the time), I can ask the originating lab directly about their methods., I did not reuse data, so I would rather have answered "N/A" when available, I use data as they are provided on the microscopic slides, For some fields (e.g. experimental chemistry) the "data sets" are rather small, and none of the questions are really applicable.
To elaborate: it takes *a lot* of effort to obtain what amounts to a single data point, and that single result is extremely meaningful., I am not sure, The paper the data were published in also published their data processing tools (e.g. R or Python scripts)., a full record on how others use the data.