OpenSubtitles - train [dataset]

Hudson, G Thomas (Durham University)

Collections

This file is in the following collections:

MuLD: The Multitask Long Document Benchmark [dataset]

OpenSubtitles - train [dataset] Open Access

MuLD (Multitask Long Document Benchmark) is a set of 6 NLP tasks whose inputs each consist of at least 10,000 words. The benchmark covers a wide variety of task types, including translation, summarization, question answering, and classification. Output lengths range from a single-word classification label up to an output longer than the input text.

Descriptions

Resource type: Dataset
Contributors: Data collector: Hudson, G Thomas ¹

¹ Durham University
Keyword: nlp; multitask; long document
Identifier: ark:/32150/r2x920fw88t
Rights: MIT Licence (MIT)
Publisher: Durham University

File Details

Depositor: G.T. Hudson
Date Uploaded: 26 April 2022, 15:04:54
Date Modified: 3 May 2022, 13:05:24
Audit Status: Audits have not yet been run on this file.
Related Files: hotpot_annotated_train.json; narrativeqa_train.json.bz2; vlsp_test.json.bz2; style_change_validation.json.bz2; hotpot_annotated_valid.json; narrativeqa_test.json.bz2; style_change_train.json.bz2; narrativeqa_validation.json.bz2; character_id_test.json.bz2; style_change_test.json.bz2; character_id_validation.json.bz2; opensubtitles_test.json.bz2; character_id_train.json.bz2
Characterization: File format: x-bzip2 (bzip2 compressed data, block size = 900k, BZ2, Bzip2)

Mime type: application/x-bzip2

File size: 423341613 bytes

Last modified: 26 April 2022, 16:12:39 (+01:00)

Filename: opensubtitles_train.json.bz2

Original checksum: 5ffb30163a8ba21757b837a6a85e19a3
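
The file details above (size, checksum, bzip2 format) can be used to check a downloaded copy before working with it. The following is a minimal sketch, not an official procedure from the depositor. It assumes the listed checksum is an MD5 digest (it is 32 hexadecimal characters long, but the record does not name the algorithm), that the file size is given in bytes, and that the decompressed JSON is UTF-8 text. The helper names `md5_of_file`, `verify_download`, and `peek_decompressed` are hypothetical and introduced only for illustration. The internal JSON layout is not described in this record, so the sketch only peeks at the start of the decompressed text rather than parsing it.

```python
import bz2
import hashlib
from pathlib import Path

# Values copied from the file details above.
EXPECTED_MD5 = "5ffb30163a8ba21757b837a6a85e19a3"  # assumption: checksum is MD5
EXPECTED_SIZE = 423341613  # assumption: size is in bytes


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5=EXPECTED_MD5, expected_size=EXPECTED_SIZE):
    """Return True if the local file matches the listed size and checksum."""
    path = Path(path)
    # Compare the cheap size check first; hash only if the size matches.
    return path.stat().st_size == expected_size and md5_of_file(path) == expected_md5


def peek_decompressed(path, num_chars=200):
    """Return the first characters of the decompressed text without reading it all."""
    # Assumption: the decompressed content is UTF-8 encoded text.
    with bz2.open(path, "rt", encoding="utf-8") as handle:
        return handle.read(num_chars)
```

With the file saved locally under its listed name, `verify_download("opensubtitles_train.json.bz2")` would report whether the copy matches the record, and `peek_decompressed("opensubtitles_train.json.bz2")` would show the start of the JSON so its structure can be inspected before choosing a parser.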
