An Unbalanced Data Classification Model Using Hybrid Sampling Technique for Fraud Detection


Ontology type: schema:Chapter

Open Access: True


Chapter Info

DATE

2007

AUTHORS

T. Maruthi Padmaja , Narendra Dhulipalla , P. Radha Krishna , Raju S. Bapi , A. Laha

ABSTRACT

Detecting fraud is a challenging task as fraud coexists with the latest in technology. The problem to detect the fraud is that the dataset is unbalanced where non-fraudulent class heavily dominates the fraudulent class. In this work, we considered the fraud detection problem as unbalanced data classification problem and proposed a model based on hybrid sampling technique, which is a combination of random under-sampling and over-sampling using SMOTE. Here, SMOTE is used to widen the data region corresponding to minority samples and random under-sampling of majority class is used for balancing the class distribution. The value difference metric (VDM) is used as distance measure while doing SMOTE. We conducted the experiments with classifiers namely k-NN, Radial Basis Function networks, C4.5 and Naive Bayes with varied levels of SMOTE on insurance fraud dataset. For evaluating the learned classifiers, we have chosen fraud catching rate, non-fraud catching rate in addition to overall accuracy of the classifier as performance measures. Results indicate that our approach produces high predictions against fraud and non-fraud classes.
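The hybrid sampling idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes numeric features and uses Euclidean distance in place of the value difference metric (VDM) named in the abstract, and it assumes that "balancing" means under-sampling the majority class down to the size of the grown minority class. The names `smote`, `hybrid_sample`, and `smote_pct` are hypothetical.

```python
import math
import random


def smote(minority, n_synthetic, k=3, rng=None):
    """Create synthetic minority points by interpolating between a sample
    and one of its k nearest minority neighbours.

    Assumption: Euclidean distance on numeric tuples; the chapter uses VDM.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position on the segment from base to mate
        synthetic.append(tuple(b + gap * (m - b) for b, m in zip(base, mate)))
    return synthetic


def hybrid_sample(majority, minority, smote_pct=200, k=3, seed=0):
    """Over-sample the minority class with SMOTE, then randomly under-sample
    the majority class to the same size (an assumed balancing target)."""
    rng = random.Random(seed)
    n_new = len(minority) * smote_pct // 100
    grown = list(minority) + smote(minority, n_new, k, rng)
    kept = rng.sample(majority, min(len(majority), len(grown)))
    return kept, grown
```

With four minority samples and `smote_pct=200`, the sketch adds eight synthetic points and keeps twelve randomly chosen majority samples. Varying `smote_pct` corresponds loosely to the "varied levels of SMOTE" mentioned in the abstract.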

PAGES

341-348

Book

TITLE

Pattern Recognition and Machine Intelligence

ISBN

978-3-540-77045-9

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/978-3-540-77046-6_43

DOI

http://dx.doi.org/10.1007/978-3-540-77046-6_43

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1033627885



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0806", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information Systems", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information and Computing Sciences", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Institute for Development and Research in Banking Technology", 
          "id": "https://www.grid.ac/institutes/grid.473631.4", 
          "name": [
            "Institute for Development and Research in Banking Technology (IDRBT), Hyderabad, India"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Padmaja", 
        "givenName": "T. Maruthi", 
        "id": "sg:person.016551073747.31", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.016551073747.31"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Institute for Development and Research in Banking Technology", 
          "id": "https://www.grid.ac/institutes/grid.473631.4", 
          "name": [
            "Institute for Development and Research in Banking Technology (IDRBT), Hyderabad, India"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Dhulipalla", 
        "givenName": "Narendra", 
        "id": "sg:person.013165006053.86", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013165006053.86"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Institute for Development and Research in Banking Technology", 
          "id": "https://www.grid.ac/institutes/grid.473631.4", 
          "name": [
            "Institute for Development and Research in Banking Technology (IDRBT), Hyderabad, India"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Krishna", 
        "givenName": "P. Radha", 
        "id": "sg:person.010423171637.27", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010423171637.27"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "University of Hyderabad", 
          "id": "https://www.grid.ac/institutes/grid.18048.35", 
          "name": [
            "Dept of Computer and Information Sciences, University of Hyderabad, \u2013 500046, India"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Bapi", 
        "givenName": "Raju S.", 
        "id": "sg:person.01367446263.09", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01367446263.09"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Institute for Development and Research in Banking Technology", 
          "id": "https://www.grid.ac/institutes/grid.473631.4", 
          "name": [
            "Institute for Development and Research in Banking Technology (IDRBT), Hyderabad, India"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Laha", 
        "givenName": "A.", 
        "id": "sg:person.011633466625.33", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.011633466625.33"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "https://doi.org/10.1111/j.0824-7935.2004.t01-1-00228.x", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1014940366"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1145/312129.312220", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1021956833"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1023/a:1007452223027", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1025850532", 
          "https://doi.org/10.1023/a:1007452223027"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1145/1007730.1007738", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1034710737"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/s0950-7051(00)00050-2", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1053241289"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1109/5254.809570", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1061186312"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1109/discex.2000.821515", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1095683204"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1613/jair.346", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1105538442"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1613/jair.953", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1105579550"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2007", 
    "datePublishedReg": "2007-01-01", 
    "description": "Detecting fraud is a challenging task as fraud coexists with the latest in technology. The problem to detect the fraud is that the dataset is unbalanced where non-fraudulent class heavily dominates the fraudulent class. In this work, we considered the fraud detection problem as unbalanced data classification problem and proposed a model based on hybrid sampling technique, which is a combination of random under-sampling and over-sampling using SMOTE. Here, SMOTE is used to widen the data region corresponding to minority samples and random under-sampling of majority class is used for balancing the class distribution. The value difference metric (VDM) is used as distance measure while doing SMOTE. We conducted the experiments with classifiers namely k-NN, Radial Basis Function networks, C4.5 and Naive Bayes with varied levels of SMOTE on insurance fraud dataset. For evaluating the learned classifiers, we have chosen fraud catching rate, non-fraud catching rate in addition to overall accuracy of the classifier as performance measures. Results indicate that our approach produces high predictions against fraud and non-fraud classes.", 
    "editor": [
      {
        "familyName": "Ghosh", 
        "givenName": "Ashish", 
        "type": "Person"
      }, 
      {
        "familyName": "De", 
        "givenName": "Rajat K.", 
        "type": "Person"
      }, 
      {
        "familyName": "Pal", 
        "givenName": "Sankar K.", 
        "type": "Person"
      }
    ], 
    "genre": "chapter", 
    "id": "sg:pub.10.1007/978-3-540-77046-6_43", 
    "inLanguage": [
      "en"
    ], 
    "isAccessibleForFree": true, 
    "isPartOf": {
      "isbn": [
        "978-3-540-77045-9"
      ], 
      "name": "Pattern Recognition and Machine Intelligence", 
      "type": "Book"
    }, 
    "name": "An Unbalanced Data Classification Model Using Hybrid Sampling Technique for Fraud Detection", 
    "pagination": "341-348", 
    "productId": [
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/978-3-540-77046-6_43"
        ]
      }, 
      {
        "name": "readcube_id", 
        "type": "PropertyValue", 
        "value": [
          "a0310983a9c4d07c6a6947f95c57a72b018d25db404eee5597e179110fd6759c"
        ]
      }, 
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1033627885"
        ]
      }
    ], 
    "publisher": {
      "location": "Berlin, Heidelberg", 
      "name": "Springer Berlin Heidelberg", 
      "type": "Organisation"
    }, 
    "sameAs": [
      "https://doi.org/10.1007/978-3-540-77046-6_43", 
      "https://app.dimensions.ai/details/publication/pub.1033627885"
    ], 
    "sdDataset": "chapters", 
    "sdDatePublished": "2019-04-16T05:44", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000347_0000000347/records_89801_00000001.jsonl", 
    "type": "Chapter", 
    "url": "https://link.springer.com/10.1007%2F978-3-540-77046-6_43"
  }
]
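As a rough illustration of how the JSON-LD record can be consumed, the following Python sketch pulls the title, author names, DOI, and page range out of a trimmed copy of the record. The helper name `summarize` is hypothetical, and only fields that appear in the record above are used.

```python
import json

# Trimmed copy of the record above; only the fields read below are kept.
RECORD_JSON = """
[
  {
    "name": "An Unbalanced Data Classification Model Using Hybrid Sampling Technique for Fraud Detection",
    "pagination": "341-348",
    "author": [
      {"familyName": "Padmaja", "givenName": "T. Maruthi"},
      {"familyName": "Dhulipalla", "givenName": "Narendra"},
      {"familyName": "Krishna", "givenName": "P. Radha"},
      {"familyName": "Bapi", "givenName": "Raju S."},
      {"familyName": "Laha", "givenName": "A."}
    ],
    "productId": [
      {"name": "doi", "type": "PropertyValue",
       "value": ["10.1007/978-3-540-77046-6_43"]},
      {"name": "dimensions_id", "type": "PropertyValue",
       "value": ["pub.1033627885"]}
    ]
  }
]
"""


def summarize(record):
    """Return a small citation-style summary of one SciGraph record."""
    # productId is a list of name/value pairs; index it by name.
    ids = {p["name"]: p["value"][0] for p in record.get("productId", [])}
    authors = [f'{a["givenName"]} {a["familyName"]}'
               for a in record.get("author", [])]
    return {
        "title": record["name"],
        "authors": authors,
        "doi": ids.get("doi"),
        "pages": record.get("pagination"),
    }


# The top level of the document is a list holding a single record.
summary = summarize(json.loads(RECORD_JSON)[0])
```

Because JSON-LD is plain JSON, the standard `json` module is enough here; a dedicated JSON-LD processor would only be needed to expand the `@context`.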
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/978-3-540-77046-6_43'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/978-3-540-77046-6_43'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/978-3-540-77046-6_43'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/978-3-540-77046-6_43'
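The four curl commands differ only in the Accept header, so the same content negotiation can be sketched in Python with the standard library. The short format keys (`json-ld`, `nt`, `turtle`, `xml`) and the function names are assumptions made for illustration, and the sketch does not check whether the endpoint still responds.

```python
import urllib.request

RECORD_URL = "https://scigraph.springernature.com/pub.10.1007/978-3-540-77046-6_43"

# Media types taken from the curl commands above; the keys are illustrative.
ACCEPT = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt, url=RECORD_URL):
    """Build a GET request asking for the given RDF serialization."""
    return urllib.request.Request(url, headers={"Accept": ACCEPT[fmt]})


def fetch(fmt, url=RECORD_URL, timeout=10):
    """Perform the request and return the response body as text."""
    with urllib.request.urlopen(build_request(fmt, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `fetch("turtle")` would be the counterpart of the third curl command; an unknown format key raises `KeyError` before any network traffic happens.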


 

This table displays all metadata directly associated to this object as RDF triples.

133 TRIPLES      23 PREDICATES      36 URIs      20 LITERALS      8 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1007/978-3-540-77046-6_43 schema:about anzsrc-for:08
2 anzsrc-for:0806
3 schema:author Nc850050e23f14cf7839d3a00a5520102
4 schema:citation sg:pub.10.1023/a:1007452223027
5 https://doi.org/10.1016/s0950-7051(00)00050-2
6 https://doi.org/10.1109/5254.809570
7 https://doi.org/10.1109/discex.2000.821515
8 https://doi.org/10.1111/j.0824-7935.2004.t01-1-00228.x
9 https://doi.org/10.1145/1007730.1007738
10 https://doi.org/10.1145/312129.312220
11 https://doi.org/10.1613/jair.346
12 https://doi.org/10.1613/jair.953
13 schema:datePublished 2007
14 schema:datePublishedReg 2007-01-01
15 schema:description Detecting fraud is a challenging task as fraud coexists with the latest in technology. The problem to detect the fraud is that the dataset is unbalanced where non-fraudulent class heavily dominates the fraudulent class. In this work, we considered the fraud detection problem as unbalanced data classification problem and proposed a model based on hybrid sampling technique, which is a combination of random under-sampling and over-sampling using SMOTE. Here, SMOTE is used to widen the data region corresponding to minority samples and random under-sampling of majority class is used for balancing the class distribution. The value difference metric (VDM) is used as distance measure while doing SMOTE. We conducted the experiments with classifiers namely k-NN, Radial Basis Function networks, C4.5 and Naive Bayes with varied levels of SMOTE on insurance fraud dataset. For evaluating the learned classifiers, we have chosen fraud catching rate, non-fraud catching rate in addition to overall accuracy of the classifier as performance measures. Results indicate that our approach produces high predictions against fraud and non-fraud classes.
16 schema:editor N28a37e914597404fb8406534016608bf
17 schema:genre chapter
18 schema:inLanguage en
19 schema:isAccessibleForFree true
20 schema:isPartOf N2f2339d7b47a4d7586b855a641ff0db4
21 schema:name An Unbalanced Data Classification Model Using Hybrid Sampling Technique for Fraud Detection
22 schema:pagination 341-348
23 schema:productId N0212755f7aa04b7cb0cb1d417eda16c0
24 N378f74aa955e4c56ab17d4740c88372d
25 Nc3e6ee743593463ea9d0d970f9b85a73
26 schema:publisher N7d5b7112faf74d8e9b542d1c0000ed40
27 schema:sameAs https://app.dimensions.ai/details/publication/pub.1033627885
28 https://doi.org/10.1007/978-3-540-77046-6_43
29 schema:sdDatePublished 2019-04-16T05:44
30 schema:sdLicense https://scigraph.springernature.com/explorer/license/
31 schema:sdPublisher N4c50cfeafacb4374a13fe92afde90cfd
32 schema:url https://link.springer.com/10.1007%2F978-3-540-77046-6_43
33 sgo:license sg:explorer/license/
34 sgo:sdDataset chapters
35 rdf:type schema:Chapter
36 N0212755f7aa04b7cb0cb1d417eda16c0 schema:name dimensions_id
37 schema:value pub.1033627885
38 rdf:type schema:PropertyValue
39 N02c0287a22af4acba6c5ef27d8a3290b schema:familyName De
40 schema:givenName Rajat K.
41 rdf:type schema:Person
42 N28a37e914597404fb8406534016608bf rdf:first N3f8b5777675a4074aa9d18f5b3253956
43 rdf:rest Nceb7ff0d589e43f9ad57e7c98c46bb87
44 N2a11bb33e2354a9c9e614e47f570a8f8 rdf:first sg:person.010423171637.27
45 rdf:rest Nbc487b298dff4dde81220e1da8794a26
46 N2a8e08673fcd4db0b254d28122e0a4ef rdf:first Nd811d1ab1b784f5fa43799bc6120990d
47 rdf:rest rdf:nil
48 N2dcc508bc01647949cfc5c39216925d8 rdf:first sg:person.011633466625.33
49 rdf:rest rdf:nil
50 N2f2339d7b47a4d7586b855a641ff0db4 schema:isbn 978-3-540-77045-9
51 schema:name Pattern Recognition and Machine Intelligence
52 rdf:type schema:Book
53 N378f74aa955e4c56ab17d4740c88372d schema:name doi
54 schema:value 10.1007/978-3-540-77046-6_43
55 rdf:type schema:PropertyValue
56 N3f8b5777675a4074aa9d18f5b3253956 schema:familyName Ghosh
57 schema:givenName Ashish
58 rdf:type schema:Person
59 N4c50cfeafacb4374a13fe92afde90cfd schema:name Springer Nature - SN SciGraph project
60 rdf:type schema:Organization
61 N75e7498e14b746dcb0b36d3f596462f3 rdf:first sg:person.013165006053.86
62 rdf:rest N2a11bb33e2354a9c9e614e47f570a8f8
63 N7d5b7112faf74d8e9b542d1c0000ed40 schema:location Berlin, Heidelberg
64 schema:name Springer Berlin Heidelberg
65 rdf:type schema:Organisation
66 Nbc487b298dff4dde81220e1da8794a26 rdf:first sg:person.01367446263.09
67 rdf:rest N2dcc508bc01647949cfc5c39216925d8
68 Nc3e6ee743593463ea9d0d970f9b85a73 schema:name readcube_id
69 schema:value a0310983a9c4d07c6a6947f95c57a72b018d25db404eee5597e179110fd6759c
70 rdf:type schema:PropertyValue
71 Nc850050e23f14cf7839d3a00a5520102 rdf:first sg:person.016551073747.31
72 rdf:rest N75e7498e14b746dcb0b36d3f596462f3
73 Nceb7ff0d589e43f9ad57e7c98c46bb87 rdf:first N02c0287a22af4acba6c5ef27d8a3290b
74 rdf:rest N2a8e08673fcd4db0b254d28122e0a4ef
75 Nd811d1ab1b784f5fa43799bc6120990d schema:familyName Pal
76 schema:givenName Sankar K.
77 rdf:type schema:Person
78 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
79 schema:name Information and Computing Sciences
80 rdf:type schema:DefinedTerm
81 anzsrc-for:0806 schema:inDefinedTermSet anzsrc-for:
82 schema:name Information Systems
83 rdf:type schema:DefinedTerm
84 sg:person.010423171637.27 schema:affiliation https://www.grid.ac/institutes/grid.473631.4
85 schema:familyName Krishna
86 schema:givenName P. Radha
87 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010423171637.27
88 rdf:type schema:Person
89 sg:person.011633466625.33 schema:affiliation https://www.grid.ac/institutes/grid.473631.4
90 schema:familyName Laha
91 schema:givenName A.
92 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.011633466625.33
93 rdf:type schema:Person
94 sg:person.013165006053.86 schema:affiliation https://www.grid.ac/institutes/grid.473631.4
95 schema:familyName Dhulipalla
96 schema:givenName Narendra
97 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013165006053.86
98 rdf:type schema:Person
99 sg:person.01367446263.09 schema:affiliation https://www.grid.ac/institutes/grid.18048.35
100 schema:familyName Bapi
101 schema:givenName Raju S.
102 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01367446263.09
103 rdf:type schema:Person
104 sg:person.016551073747.31 schema:affiliation https://www.grid.ac/institutes/grid.473631.4
105 schema:familyName Padmaja
106 schema:givenName T. Maruthi
107 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.016551073747.31
108 rdf:type schema:Person
109 sg:pub.10.1023/a:1007452223027 schema:sameAs https://app.dimensions.ai/details/publication/pub.1025850532
110 https://doi.org/10.1023/a:1007452223027
111 rdf:type schema:CreativeWork
112 https://doi.org/10.1016/s0950-7051(00)00050-2 schema:sameAs https://app.dimensions.ai/details/publication/pub.1053241289
113 rdf:type schema:CreativeWork
114 https://doi.org/10.1109/5254.809570 schema:sameAs https://app.dimensions.ai/details/publication/pub.1061186312
115 rdf:type schema:CreativeWork
116 https://doi.org/10.1109/discex.2000.821515 schema:sameAs https://app.dimensions.ai/details/publication/pub.1095683204
117 rdf:type schema:CreativeWork
118 https://doi.org/10.1111/j.0824-7935.2004.t01-1-00228.x schema:sameAs https://app.dimensions.ai/details/publication/pub.1014940366
119 rdf:type schema:CreativeWork
120 https://doi.org/10.1145/1007730.1007738 schema:sameAs https://app.dimensions.ai/details/publication/pub.1034710737
121 rdf:type schema:CreativeWork
122 https://doi.org/10.1145/312129.312220 schema:sameAs https://app.dimensions.ai/details/publication/pub.1021956833
123 rdf:type schema:CreativeWork
124 https://doi.org/10.1613/jair.346 schema:sameAs https://app.dimensions.ai/details/publication/pub.1105538442
125 rdf:type schema:CreativeWork
126 https://doi.org/10.1613/jair.953 schema:sameAs https://app.dimensions.ai/details/publication/pub.1105579550
127 rdf:type schema:CreativeWork
128 https://www.grid.ac/institutes/grid.18048.35 schema:alternateName University of Hyderabad
129 schema:name Dept of Computer and Information Sciences, University of Hyderabad, – 500046, India
130 rdf:type schema:Organization
131 https://www.grid.ac/institutes/grid.473631.4 schema:alternateName Institute for Development and Research in Banking Technology
132 schema:name Institute for Development and Research in Banking Technology (IDRBT), Hyderabad, India
133 rdf:type schema:Organization
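In the table, the author order is encoded as an RDF list: schema:author points at a blank node, and each node carries rdf:first (the item) and rdf:rest (the next node) until rdf:nil. A minimal Python sketch of walking that chain follows, using the blank node and person identifiers copied from the rows above. The dictionary layout and the name `walk_rdf_list` are assumptions made for illustration.

```python
RDF_NIL = "rdf:nil"

# (subject, predicate) -> object, copied from the rdf:first / rdf:rest rows.
LIST_TRIPLES = {
    ("Nc850050e23f14cf7839d3a00a5520102", "rdf:first"): "sg:person.016551073747.31",
    ("Nc850050e23f14cf7839d3a00a5520102", "rdf:rest"): "N75e7498e14b746dcb0b36d3f596462f3",
    ("N75e7498e14b746dcb0b36d3f596462f3", "rdf:first"): "sg:person.013165006053.86",
    ("N75e7498e14b746dcb0b36d3f596462f3", "rdf:rest"): "N2a11bb33e2354a9c9e614e47f570a8f8",
    ("N2a11bb33e2354a9c9e614e47f570a8f8", "rdf:first"): "sg:person.010423171637.27",
    ("N2a11bb33e2354a9c9e614e47f570a8f8", "rdf:rest"): "Nbc487b298dff4dde81220e1da8794a26",
    ("Nbc487b298dff4dde81220e1da8794a26", "rdf:first"): "sg:person.01367446263.09",
    ("Nbc487b298dff4dde81220e1da8794a26", "rdf:rest"): "N2dcc508bc01647949cfc5c39216925d8",
    ("N2dcc508bc01647949cfc5c39216925d8", "rdf:first"): "sg:person.011633466625.33",
    ("N2dcc508bc01647949cfc5c39216925d8", "rdf:rest"): RDF_NIL,
}

# schema:familyName values for the person nodes, also from the table.
FAMILY_NAME = {
    "sg:person.016551073747.31": "Padmaja",
    "sg:person.013165006053.86": "Dhulipalla",
    "sg:person.010423171637.27": "Krishna",
    "sg:person.01367446263.09": "Bapi",
    "sg:person.011633466625.33": "Laha",
}


def walk_rdf_list(head, triples):
    """Follow rdf:first / rdf:rest from head and return the items in order."""
    items, node, seen = [], head, set()
    while node != RDF_NIL:
        if node in seen:  # guard against a malformed, cyclic list
            raise ValueError(f"cycle at {node}")
        seen.add(node)
        items.append(triples[(node, "rdf:first")])
        node = triples[(node, "rdf:rest")]
    return items


# The head node is the object of the schema:author row.
authors = [FAMILY_NAME[p]
           for p in walk_rdf_list("Nc850050e23f14cf7839d3a00a5520102", LIST_TRIPLES)]
```

The resulting order matches the AUTHORS line at the top of the record, which is the reason the list encoding exists: plain triples have no inherent order.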
 



