Retrieval of Chinese Text Containing Recognition Errors


Ontology type: schema:MonetaryGrant     


Grant Info

YEARS

2003-2006

FUNDING AMOUNT

260000 CNY

ABSTRACT

This project studies the theory and methods of full-text retrieval over Chinese text that was obtained by scanning and recognition during document digitization and therefore contains a certain number of errors. Modern society is moving rapidly toward informatization, networking, and digitization, and China is advancing toward this goal as well. A large number of non-digitized books and other documents have accumulated over the course of history, and they contain a wealth of valuable information. For this information to play its due role in modern society, paper documents must be digitized, and text recognition is the commonly used means. For ease of use, a system for retrieving the digitized documents is also needed. The recognition process inevitably introduces some errors, and even after proofreading it is difficult to eliminate all of them. Manual proofreading also consumes a great deal of manpower, material resources, and time. Because the human brain has strong comprehension and error-correction abilities, text that contains some errors can generally still be understood. Errors in the text, however, make full-text retrieval very difficult, because the most basic starting point of traditional information retrieval models is the indexing and matching of words. When a word that needs to be matched contains an error, the search inevitably fails. It is therefore necessary to study effective retrieval over text that contains recognition errors ...

URL

http://npd.nsfc.gov.cn/projectDetail.action?pid=60303005

JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/2208", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "type": "DefinedTerm"
      }
    ], 
    "amount": {
      "currency": "CNY", 
      "type": "MonetaryAmount", 
      "value": "260000"
    }, 
    "description": "This research document digitization process, after the Chinese text contains certain errors scanning recognition obtained full-text search theory and method. Modern society is developing rapidly to the information, networking, digital direction. China is also advancing towards this goal. Accumulated a large number of non-digitized books and other documentation in the historical development, the information contained in these documents a wealth of valuable information. To make these information play its due role in modern society, it is necessary to digitize paper documents, commonly used means of text recognition. For ease of use, but also need to establish a system to retrieve the digitized documents. In the recognition process will inevitably have some errors. Even after proof is still difficult to eliminate all the errors. And artificial proofread it takes a lot of manpower and material resources and time. In fact, because the human brain has a strong ability to understand and error correction capability, it contains some erroneous text can still be generally understood. However, errors in the text gave the full-text search has brought great difficulties. This is because the traditional information retrieval model the most basic starting point is that word index and match. When you need to match the word error, it will inevitably lead to the search fails. Thus, it is necessary to study very effective retrieval on the text containing the recognition of wrong", 
    "endDate": "2006-12-30T00:00:00Z", 
    "funder": {
      "id": "https://www.grid.ac/institutes/grid.419696.5", 
      "type": "Organization"
    }, 
    "id": "sg:grant.4944396", 
    "identifier": [
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "4944396"
        ]
      }, 
      {
        "name": "nsfc_id", 
        "type": "PropertyValue", 
        "value": [
          "60303005"
        ]
      }
    ], 
    "inLanguage": [
      "zh"
    ], 
    "keywords": [
      "information", 
      "paper documents", 
      "China", 
      "time", 
      "document", 
      "proof", 
      "error correction capability", 
      "word index", 
      "lot", 
      "match", 
      "manpower", 
      "METHODS", 
      "human brain", 
      "historical development", 
      "goal", 
      "wealth", 
      "search", 
      "recognition", 
      "great difficulty", 
      "text", 
      "system", 
      "Chinese text", 
      "text recognition", 
      "use", 
      "non-digitized books", 
      "recognition errors Retrieval", 
      "full-text search theory", 
      "valuable information", 
      "strong ability", 
      "modern society", 
      "research document digitization process", 
      "fact", 
      "recognition process", 
      "other documentation", 
      "basic starting point", 
      "errors", 
      "erroneous text", 
      "traditional information retrieval models", 
      "certain errors", 
      "artificial proofread", 
      "digitized documents", 
      "effective retrieval", 
      "full-text search", 
      "means", 
      "due role", 
      "word errors", 
      "digital direction", 
      "material resources", 
      "networking", 
      "ease", 
      "large number"
    ], 
    "name": "Chinese text contains recognition errors Retrieval", 
    "recipient": [
      {
        "id": "https://www.grid.ac/institutes/grid.12527.33", 
        "type": "Organization"
      }, 
      {
        "affiliation": {
          "id": "https://www.grid.ac/institutes/grid.12527.33", 
          "name": "tsinghua university", 
          "type": "Organization"
        }, 
        "familyName": "Jin", 
        "givenName": "Yi Jiang", 
        "id": "sg:person.0671170055.85", 
        "type": "Person"
      }, 
      {
        "member": "sg:person.0671170055.85", 
        "roleName": "PI", 
        "type": "Role"
      }
    ], 
    "sameAs": [
      "https://app.dimensions.ai/details/grant/grant.4944396"
    ], 
    "sdDataset": "grants", 
    "sdDatePublished": "2019-03-07T12:42", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com.uberresearch.data.processor/core_data/20181219_192338/projects/base/nsfc_projects_3.xml.gz", 
    "startDate": "2003-12-31T00:00:00Z", 
    "type": "MonetaryGrant", 
    "url": "http://npd.nsfc.gov.cn/projectDetail.action?pid=60303005"
  }
]
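
As an illustration of how the record above can be consumed, the following minimal sketch reads a trimmed copy of the JSON-LD and pulls out the grant id, the principal investigator, the amount, and the external identifiers. The helper name `summarize_grant` and the layout of the returned dictionary are assumptions made for this example; they are not part of SciGraph. The sketch treats the record as plain JSON and does not perform JSON-LD context expansion.

```python
import json

# Trimmed copy of the record above; only the fields used below are kept.
RECORD_JSON = """
[
  {
    "amount": {"currency": "CNY", "type": "MonetaryAmount", "value": "260000"},
    "endDate": "2006-12-30T00:00:00Z",
    "id": "sg:grant.4944396",
    "identifier": [
      {"name": "dimensions_id", "type": "PropertyValue", "value": ["4944396"]},
      {"name": "nsfc_id", "type": "PropertyValue", "value": ["60303005"]}
    ],
    "recipient": [
      {"id": "https://www.grid.ac/institutes/grid.12527.33", "type": "Organization"},
      {"familyName": "Jin", "givenName": "Yi Jiang",
       "id": "sg:person.0671170055.85", "type": "Person"},
      {"member": "sg:person.0671170055.85", "roleName": "PI", "type": "Role"}
    ],
    "startDate": "2003-12-31T00:00:00Z",
    "type": "MonetaryGrant"
  }
]
"""


def summarize_grant(record):
    """Flatten one grant object into a small dict (hypothetical helper)."""
    recipients = record.get("recipient", [])
    # Index Person entries by id so that Role entries can be resolved.
    people = {r["id"]: r for r in recipients if r.get("type") == "Person"}

    pi_names = []
    for entry in recipients:
        # A Role entry points at a Person through its "member" id.
        if entry.get("type") == "Role" and entry.get("roleName") == "PI":
            person = people.get(entry.get("member"))
            if person is not None:
                pi_names.append(f'{person["givenName"]} {person["familyName"]}')

    # Each identifier carries a one-element list of values in this record.
    identifiers = {i["name"]: i["value"][0] for i in record.get("identifier", [])}
    amount = record.get("amount", {})
    return {
        "grant_id": record.get("id"),
        "pi": pi_names,
        "amount": int(amount["value"]) if "value" in amount else None,
        "currency": amount.get("currency"),
        "identifiers": identifiers,
        "start": record.get("startDate"),
        "end": record.get("endDate"),
    }


if __name__ == "__main__":
    grant = json.loads(RECORD_JSON)[0]
    print(summarize_grant(grant))
```

Resolving the PI through the `Role` entry, rather than taking the first `Person`, follows the way the record links `member` to a person id.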
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/grant.4944396'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/grant.4944396'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/grant.4944396'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/grant.4944396'
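
The four curl commands differ only in the `Accept` header, so the same content negotiation can be sketched in Python with the standard library. The function names `build_request` and `fetch` and the short format keys are assumptions made for this example; the URL and media types are taken from the commands above. Whether the endpoint still responds is not something this page confirms, so `fetch` is shown but not called here.

```python
import urllib.request

BASE_URL = "https://scigraph.springernature.com/grant.4944396"

# Media types copied from the curl commands above.
MEDIA_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt, url=BASE_URL):
    """Build a GET request asking for one RDF serialization (hypothetical helper)."""
    if fmt not in MEDIA_TYPES:
        raise ValueError(f"unknown format: {fmt!r}")
    return urllib.request.Request(url, headers={"Accept": MEDIA_TYPES[fmt]})


def fetch(fmt, url=BASE_URL, timeout=30):
    """Send the request and return the response body as text (needs network access)."""
    with urllib.request.urlopen(build_request(fmt, url), timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Only inspect the requests; nothing is sent over the network here.
    for fmt in MEDIA_TYPES:
        request = build_request(fmt)
        print(fmt, "->", request.get_header("Accept"))
```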


 

This table displays all metadata directly associated with this object as RDF triples.

95 TRIPLES      19 PREDICATES      73 URIs      65 LITERALS      5 BLANK NODES

Subject Predicate Object
1 sg:grant.4944396 schema:about anzsrc-for:2208
2 schema:amount N8f8f07bada834bc9ac033f9d1cdfb590
3 schema:description This research document digitization process, after the Chinese text contains certain errors scanning recognition obtained full-text search theory and method. Modern society is developing rapidly to the information, networking, digital direction. China is also advancing towards this goal. Accumulated a large number of non-digitized books and other documentation in the historical development, the information contained in these documents a wealth of valuable information. To make these information play its due role in modern society, it is necessary to digitize paper documents, commonly used means of text recognition. For ease of use, but also need to establish a system to retrieve the digitized documents. In the recognition process will inevitably have some errors. Even after proof is still difficult to eliminate all the errors. And artificial proofread it takes a lot of manpower and material resources and time. In fact, because the human brain has a strong ability to understand and error correction capability, it contains some erroneous text can still be generally understood. However, errors in the text gave the full-text search has brought great difficulties. This is because the traditional information retrieval model the most basic starting point is that word index and match. When you need to match the word error, it will inevitably lead to the search fails. Thus, it is necessary to study very effective retrieval on the text containing the recognition of wrong
4 schema:endDate 2006-12-30T00:00:00Z
5 schema:funder https://www.grid.ac/institutes/grid.419696.5
6 schema:identifier N387ca7a205524e3d85b6cce4554004c6
7 Nd5b7b636de0b4203943aa5f7eb2d3b43
8 schema:inLanguage zh
9 schema:keywords China
10 Chinese text
11 METHODS
12 artificial proofread
13 basic starting point
14 certain errors
15 digital direction
16 digitized documents
17 document
18 due role
19 ease
20 effective retrieval
21 erroneous text
22 error correction capability
23 errors
24 fact
25 full-text search
26 full-text search theory
27 goal
28 great difficulty
29 historical development
30 human brain
31 information
32 large number
33 lot
34 manpower
35 match
36 material resources
37 means
38 modern society
39 networking
40 non-digitized books
41 other documentation
42 paper documents
43 proof
44 recognition
45 recognition errors Retrieval
46 recognition process
47 research document digitization process
48 search
49 strong ability
50 system
51 text
52 text recognition
53 time
54 traditional information retrieval models
55 use
56 valuable information
57 wealth
58 word errors
59 word index
60 schema:name Chinese text contains recognition errors Retrieval
61 schema:recipient N275e36666ae04cc0ba22204b36a75b6d
62 sg:person.0671170055.85
63 https://www.grid.ac/institutes/grid.12527.33
64 schema:sameAs https://app.dimensions.ai/details/grant/grant.4944396
65 schema:sdDatePublished 2019-03-07T12:42
66 schema:sdLicense https://scigraph.springernature.com/explorer/license/
67 schema:sdPublisher N5fd91fddc6fe4cdea78d1b4f406d563a
68 schema:startDate 2003-12-31T00:00:00Z
69 schema:url http://npd.nsfc.gov.cn/projectDetail.action?pid=60303005
70 sgo:license sg:explorer/license/
71 sgo:sdDataset grants
72 rdf:type schema:MonetaryGrant
73 N275e36666ae04cc0ba22204b36a75b6d schema:member sg:person.0671170055.85
74 schema:roleName PI
75 rdf:type schema:Role
76 N387ca7a205524e3d85b6cce4554004c6 schema:name nsfc_id
77 schema:value 60303005
78 rdf:type schema:PropertyValue
79 N5fd91fddc6fe4cdea78d1b4f406d563a schema:name Springer Nature - SN SciGraph project
80 rdf:type schema:Organization
81 N8f8f07bada834bc9ac033f9d1cdfb590 schema:currency CNY
82 schema:value 260000
83 rdf:type schema:MonetaryAmount
84 Nd5b7b636de0b4203943aa5f7eb2d3b43 schema:name dimensions_id
85 schema:value 4944396
86 rdf:type schema:PropertyValue
87 anzsrc-for:2208 schema:inDefinedTermSet anzsrc-for:
88 rdf:type schema:DefinedTerm
89 sg:person.0671170055.85 schema:affiliation https://www.grid.ac/institutes/grid.12527.33
90 schema:familyName Jin
91 schema:givenName Yi Jiang
92 rdf:type schema:Person
93 https://www.grid.ac/institutes/grid.12527.33 schema:name tsinghua university
94 rdf:type schema:Organization
95 https://www.grid.ac/institutes/grid.419696.5 rdf:type schema:Organization
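
Summary counts like those shown above the table can be approximated from the N-Triples serialization. The sketch below is a rough line-based reader, not a full N-Triples parser, and the function name `summarize_ntriples` is an assumption made for this example. The sample lines are hand-written for illustration: the `example.org` subject IRI and the `_:b0` label are placeholders, not the identifiers SciGraph actually emits. How the page itself counts URIs and literals is not stated, so its figures need not match what this function reports.

```python
import re

# subject, predicate, then everything up to the final " ." as the object.
TRIPLE_RE = re.compile(r"^(\S+)\s+(\S+)\s+(.+?)\s*\.\s*$")

# Hand-written sample; the subject IRI and blank node label are placeholders.
SAMPLE_NT = """
<http://example.org/grant.4944396> <http://schema.org/amount> _:b0 .
_:b0 <http://schema.org/currency> "CNY" .
_:b0 <http://schema.org/value> "260000" .
_:b0 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/MonetaryAmount> .
<http://example.org/grant.4944396> <http://schema.org/inLanguage> "zh" .
"""


def summarize_ntriples(text):
    """Count triples, distinct predicates, distinct blank nodes, and literal objects."""
    triples = 0
    literals = 0
    predicates = set()
    blank_nodes = set()
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = TRIPLE_RE.match(line)
        if match is None:
            raise ValueError(f"not a triple: {line!r}")
        subject, predicate, obj = match.groups()
        triples += 1
        predicates.add(predicate)
        # Blank nodes can appear as subject or object.
        for term in (subject, obj):
            if term.startswith("_:"):
                blank_nodes.add(term)
        # Literal objects start with a double quote.
        if obj.startswith('"'):
            literals += 1
    return {
        "triples": triples,
        "predicates": len(predicates),
        "blank_nodes": len(blank_nodes),
        "literals": literals,
    }


if __name__ == "__main__":
    print(summarize_ntriples(SAMPLE_NT))
```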
 



