OpenSubtitles - test [dataset]

Hudson, G Thomas (Durham University)

Collections

This file is in the following collections:

MuLD: The Multitask Long Document Benchmark [dataset]

OpenSubtitles - test [dataset] Open Access

MuLD (Multitask Long Document Benchmark) is a set of six NLP tasks in which every input is at least 10,000 words long. The benchmark covers a wide variety of task types, including translation, summarization, question answering, and classification. Output lengths also vary widely, from a single-word classification label to an output longer than the input text.

Descriptions

Resource type: Dataset
Contributors: Data collector: Hudson, G Thomas ¹

¹ Durham University
Keywords: nlp; multitask; long document
Identifier: ark:/32150/r26w924b850
Rights: MIT Licence (MIT)
Publisher: Durham University

File Details

Depositor: G.T. Hudson
Date Uploaded: 26 April 2022, 15:04:20
Date Modified: 2 November 2022, 10:11:42
Audit Status: Audits have not yet been run on this file.
Related Files: hotpot_annotated_train.json; narrativeqa_train.json.bz2; vlsp_test.json.bz2; style_change_validation.json.bz2; hotpot_annotated_valid.json; narrativeqa_test.json.bz2; opensubtitles_train.json.bz2; style_change_train.json.bz2; narrativeqa_validation.json.bz2; character_id_test.json.bz2; style_change_test.json.bz2; character_id_validation.json.bz2; character_id_train.json.bz2
Characterization: File format: x-bzip2 (bzip2 compressed data, block size = 900k, BZ2, Bzip2)

Mime type: application/x-bzip2

File size: 53319550 bytes

Last modified: 2022-04-26 16:11:11+01:00

Filename: opensubtitles_test.json.bz2

Original checksum: 9c31455051f46ad887a95c777d9d61d5
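
The file size and checksum listed above can be used to check a downloaded copy before use. The following is a minimal sketch, not part of the original record. It assumes the 32-character checksum is an MD5 digest (the record does not name the algorithm) and that the file size is given in bytes. The helper names `md5_of_file`, `verify_download`, and `peek_decompressed` are hypothetical, and no assumption is made about the structure of the JSON inside the archive.

```python
import bz2
import hashlib
from pathlib import Path

# Values copied from the record above.
FILENAME = "opensubtitles_test.json.bz2"
EXPECTED_MD5 = "9c31455051f46ad887a95c777d9d61d5"  # assumed to be MD5
EXPECTED_SIZE = 53319550  # assumed to be in bytes


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5=EXPECTED_MD5, expected_size=EXPECTED_SIZE):
    """Return True if the file's size and MD5 digest both match the expected values."""
    path = Path(path)
    if path.stat().st_size != expected_size:
        return False
    return md5_of_file(path) == expected_md5


def peek_decompressed(path, num_bytes=200):
    """Return the first decompressed bytes without loading the whole file."""
    with bz2.open(path, "rb") as handle:
        return handle.read(num_bytes)


if __name__ == "__main__":
    # Only runs the check if the file has already been downloaded locally.
    if Path(FILENAME).exists():
        print("checksum ok:", verify_download(FILENAME))
        print(peek_decompressed(FILENAME))
```

Reading only the first few decompressed bytes is a cheap way to inspect the layout of the JSON before deciding how to parse the full file.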
