Summary of the paper

Title Identifying Paraphrases between Technical and Lay Corpora
Authors Louise Deléger and Pierre Zweigenbaum
Abstract In previous work, we presented a preliminary study to identify paraphrases between technical and lay discourse types from medical corpora dedicated to the French language. In this paper, we test the hypothesis that the same kinds of paraphrases as for French can be detected between English technical and lay discourse types and report the adaptation of our method from French to English. Starting from the constitution of monolingual comparable corpora, we extract two kinds of paraphrases: paraphrases between nominalizations and verbal constructions and paraphrases between neo-classical compounds and modern-language phrases. We do this relying on morphological resources and a set of extraction rules we adapt from the original approach for French. Results show that paraphrases could be identified with a rather good precision, and that these types of paraphrase are relevant in the context of the opposition between technical and lay discourse types. These observations are consistent with the results obtained for French, which demonstrates the portability of the approach as well as the similarity of the two languages as regards the use of those kinds of expressions in technical and lay discourse types.
Language Multilinguality
Topics Textual Entailment and Paraphrasing, Information Extraction, Information Retrieval, Multilinguality
Full paper Identifying Paraphrases between Technical and Lay Corpora
Bibtex @InProceedings{DELGER10.472,
  author = {Louise Deléger and Pierre Zweigenbaum},
  title = {Identifying Paraphrases between Technical and Lay Corpora},
  booktitle = {Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC'10)},
  year = {2010},
  month = {may},
  date = {19-21},
  address = {Valletta, Malta},
  editor = {Nicoletta Calzolari (Conference Chair), Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-6-7},
  language = {english}
 }