Project

A Bluetooth-based attendance system offers a paperless way of taking attendance, both in the classroom and in a company. It replaces the traditional, time-consuming and hectic attendance process with one built on Bluetooth. We present a paperless and flexible way of taking attendance. Besides Bluetooth, the system also uses OCR, i.e. Optical Character Recognition, which helps the attendance taker to authenticate the attendance: the person being marked has to draw his signature on the screen to confirm his attendance.

This system was developed with the way attendance is taken in schools and organizations in mind. They use the traditional, paper-based method. With recent technological advancements, this can easily be automated with the help of Bluetooth and OCR, i.e. Optical Character Recognition.
The objective is simple yet productive: developed into a full-fledged model, the system could solve issues such as the authentication of attendance, and let a student or employee check his attendance by connecting directly to the main server, or plan his holidays according to his attendance.

Taking the attendance of hundreds of students or thousands of employees is hectic and time-consuming. Back in the late 90s, attendance was taken on paper and then cross-checked by remembering each of the students, which was obviously a hectic process and also allowed people to mark false attendance for each other. This generation is technology-oriented. People are becoming lazy, so they want everything automated, which saves time, money and manpower.
Bluetooth Attendance System is software that automatically records the attendance of employees according to whether or not they are present in the office. It first records the attendance and then saves the data to the database server. Handwriting Recognition is developed to recognize handwritten letters and characters. Its engine derives from the Java NN Framework. This project combines both features, so we call it "BLUETOOTH ATTENDANCE WITH OCR SIGNATURE". When a person enters his office, he connects to the office attendance database through the Bluetooth on his mobile. His attendance is thus marked at the time he enters the office, no matter where in the office he is. However, because of the lack of security, it cannot be certain whether he has really marked his attendance himself. Hence, we introduce the OCR signature concept to confirm his presence. The project is thus defined as:
"The person marks his attendance using Bluetooth and confirms his attendance using an OCR signature."
This project tries to implement a system that overcomes the limitations of the existing approach. Taking attendance on mobile phones using Bluetooth instead of the traditional approach is one step towards sustainable development. Doing the same work on mobile phones not only saves resources but also gives the user easy and interactive access to the attendance records of students. The application helps the teacher take the attendance of the students through their own mobile devices. What could be more interesting than that? The system uses Optical Character Recognition to authenticate the student's signature and so avoid proxy attendance.
The system uses regular Bluetooth together with Optical Character Recognition technology. Both technologies are readily available on the market.


Bluetooth is a way to connect devices wirelessly using radio signals. It is a standard wire-replacement communications protocol designed primarily for low power consumption, with a short range, based on low-cost transceiver microchips in each device, as in Figure 1.1.
 
            Figure 1.1 Bluetooth connection to server
Because the devices use a radio (broadcast) communications system, they do not have to be in visual line of sight of each other; however, a quasi-optical wireless path must be viable. Range depends on the power class, but effective ranges vary in practice.
 
             Figure 1.2 OCR recognition
Optical Character Recognition is a technology that recognizes characters in order to authenticate them and check their genuineness (Figure 1.2). It checks whether the written text is original or not. OCR is a field of research in pattern recognition and computer vision. Optical character recognition (OCR) is the mechanical or electronic conversion of images of typewritten or printed text into machine-encoded text. It is widely used as a form of data entry and draws on concepts from artificial intelligence.

1.2  THESIS AND LITERATURE SURVEY:
The idea for this system came from a research paper entitled "BLUETOOTH BASED ATTENDANCE SYSTEM USING RFID", which describes a Bluetooth-based attendance system combined with RFID technology. Radio Frequency Identification (RFID) lets a student verify his attendance using a card issued to him by the authority.
The RFID card is a common device, and anybody can use it to mark the attendance of another person. This lack of authentication leads to concurrency, that is, one person marking attendance on behalf of another. The OCR concept helps the attendance taker to authenticate the attendance and reduces the problem of concurrency. In this system, a person becomes visible over Bluetooth to the attendance taker's device, and he or she then has to authenticate the attendance with the help of OCR.
The idea behind this system is automation, the process of making a system work automatically without manpower. The Bluetooth-based attendance system lets a teacher scan all nearby students' Bluetooth devices from his own system. The system reads the MAC address of each Bluetooth device and checks whether it is already registered. If it is registered, the system automatically marks the student present. This is only half of the process; the next step is to authenticate the attendance. The student does this by drawing on the screen the character that was assigned to him at registration.
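The two-step flow described above (mark by MAC address, then authenticate) can be sketched as follows. This is only a minimal illustration, not the project's implementation: the class name `AttendanceSheet`, its fields, and the `signature_ok` flag are hypothetical, the Bluetooth scan itself is not performed, and the scanned addresses are simply passed in as a list.

```python
from dataclasses import dataclass, field


@dataclass
class AttendanceSheet:
    # Hypothetical data model: registered MAC address -> student name.
    registered: dict
    present: set = field(default_factory=set)    # marked by the Bluetooth scan
    confirmed: set = field(default_factory=set)  # authenticated afterwards

    def __post_init__(self):
        # Normalise addresses so that lookups ignore letter case.
        self.registered = {mac.upper(): name for mac, name in self.registered.items()}

    def mark_from_scan(self, scanned_macs):
        """Step 1: mark every scanned device with a registered MAC as present."""
        for mac in scanned_macs:
            key = mac.upper()
            if key in self.registered:
                self.present.add(key)

    def confirm(self, mac, signature_ok):
        """Step 2: confirm only a device already marked present.

        signature_ok stands in for the result of the OCR check (assumed).
        """
        key = mac.upper()
        if key in self.present and signature_ok:
            self.confirmed.add(key)
            return True
        return False
```

Keeping `present` and `confirmed` as separate sets mirrors the text: a device seen by the scan is only half of the process until the signature step succeeds.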
 

WHAT WAS INFERRED?
The existing systems available in the industry pair Bluetooth with either RFID or biometrics. These are either commonly used but flawed, or costly, and they are less flexible. RFID can also be hacked easily.
The system described here also uses Bluetooth, but adds a new concept: Optical Character Recognition. It checks whether the written text is original or not. OCR is a field of research in pattern recognition and computer vision. Optical character recognition (OCR) is the mechanical or electronic conversion of images of typewritten or printed text into machine-encoded text.
The existing systems can only mark the attendance but cannot authenticate it, and students cannot check whether they were marked absent or present. In the proposed system, the student confirms and authenticates his attendance. The student data is then uploaded to the server, where students can check their attendance.
The Bluetooth Attendance System and the Optical Character Recognition Signature System already exist individually.
The Bluetooth Attendance System scans for all Bluetooth devices and takes the hardware address of each scanned device. It then fetches the person's information from an already stored table and stores the data in the database.
The Optical Character Recognition Signature system exists today essentially as Handwritten Optical Character Recognition, in which an OCR machine reads machine-printed or handwritten characters and tries to determine which character from a fixed set of machine-printed or handwritten characters is intended.
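The lookup-and-store step can be illustrated with Python's built-in `sqlite3` module. The table names `person` and `attendance`, their columns, and the helper `make_demo_db` are assumptions made for this sketch; the source does not specify the project's schema or database server.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with a hypothetical schema and one person."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE person (person_id INTEGER PRIMARY KEY, name TEXT, mac TEXT UNIQUE)"
    )
    conn.execute(
        "CREATE TABLE attendance (person_id INTEGER, day TEXT, PRIMARY KEY (person_id, day))"
    )
    conn.execute("INSERT INTO person (name, mac) VALUES ('Asha', 'AA:BB:CC:00:11:22')")
    return conn


def record_attendance(conn, scanned_macs, day):
    """Look up each scanned address and store one attendance row per known person."""
    marked = []
    for mac in scanned_macs:
        row = conn.execute(
            "SELECT person_id, name FROM person WHERE mac = ?", (mac.upper(),)
        ).fetchone()
        if row is None:
            continue  # unknown device: nothing to record
        person_id, name = row
        # The composite primary key makes a repeated scan on the same day a no-op.
        conn.execute(
            "INSERT OR IGNORE INTO attendance (person_id, day) VALUES (?, ?)",
            (person_id, day),
        )
        marked.append(name)
    conn.commit()
    return marked
```

Using the (person, day) pair as the key is one possible design choice: scanning the room several times a day then cannot create duplicate records.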

Fundamentals of Bluetooth and OCR

2.1) Bluetooth:
Bluetooth is a wireless technology standard for exchanging data over short distances (using short-wavelength UHF radio waves in the ISM band from 2.4 to 2.485 GHz) from fixed and mobile devices, and building personal area networks (PANs). Invented by telecom vendor Ericsson in 1994, it was originally conceived as a wireless alternative to RS-232 data cables. It can connect several devices, overcoming problems of synchronization.
Bluetooth operates in the range of 2400–2483.5 MHz (including guard bands). This is in the globally unlicensed (but not unregulated) Industrial, Scientific and Medical (ISM) 2.4 GHz short-range radio frequency band. Bluetooth uses a radio technology called frequency-hopping spread spectrum. The transmitted data are divided into packets, and each packet is transmitted on one of the 79 designated Bluetooth channels. Each channel has a bandwidth of 1 MHz. The first channel starts at 2402 MHz and the channels continue up to 2480 MHz in 1 MHz steps. Bluetooth Low Energy, introduced in Bluetooth 4.0, uses 2 MHz spacing, which allows for 40 channels. Bluetooth usually performs 1600 hops per second, with Adaptive Frequency-Hopping (AFH) enabled.
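The channel arithmetic above can be checked with a few lines of code. The function names are illustrative only, and `k` here is the RF channel number counted from the bottom of the band, which is an assumption of this sketch (it is not the same as the link-layer channel index used by Bluetooth Low Energy).

```python
def classic_channel_mhz(k):
    """Centre frequency in MHz of classic Bluetooth RF channel k (0..78), 1 MHz spacing."""
    if not 0 <= k <= 78:
        raise ValueError("classic Bluetooth has RF channels 0..78")
    return 2402 + k


def le_channel_mhz(k):
    """Centre frequency in MHz of Bluetooth Low Energy RF channel k (0..39), 2 MHz spacing."""
    if not 0 <= k <= 39:
        raise ValueError("Bluetooth Low Energy has RF channels 0..39")
    return 2402 + 2 * k


# At 1600 hops per second, each hop lasts 1/1600 s, i.e. 625 microseconds.
HOP_DURATION_US = 1_000_000 / 1600
```

Both schemes start at 2402 MHz and end at 2480 MHz; they differ only in the spacing and therefore in the number of channels.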
 
      Figure 2.1 Bluetooth
2.2) OCR:
Optical character recognition (OCR) is the mechanical or electronic conversion of images of typewritten or printed text into machine-encoded text. It is widely used as a form of data entry from printed paper data records, whether passport documents, invoices, bank statements, receipts, business cards, mail, or other documents. It is a common method of digitizing printed texts so that they can be electronically edited, searched, stored more compactly, displayed on-line, and used in machine processes such as machine translation, text-to-speech, key data and text mining. OCR is a field of research in pattern recognition, artificial intelligence and computer vision.
Early versions needed to be trained with images of each character, and worked on one font at a time. Advanced systems that have a high degree of recognition accuracy for most fonts are now common. Some systems are capable of reproducing formatted output that closely approximates the original page including images, columns, and other non-textual components.
OCR is generally an "offline" process, which analyzes a static document. Handwriting movement analysis can be used as input to handwriting recognition. Instead of merely using the shapes of glyphs and words, this technique is able to capture motions, such as the order in which segments are drawn, the direction, and the pattern of putting the pen down and lifting it. This additional information can make the end-to-end process more accurate. This technology is also known as "on-line character recognition", "dynamic character recognition", "real-time character recognition", and "intelligent character recognition".
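For a signature drawn on a screen, the handwriting movement analysis mentioned in this section works on pen strokes rather than on a finished image. One common way to capture the drawing direction is to quantise each pen movement into eight compass directions. The sketch below assumes a stroke is a list of (x, y) points with y increasing upwards; the function name is hypothetical, and the source does not say that the project uses this encoding.

```python
import math


def direction_codes(stroke):
    """Quantise pen movement between consecutive points into 8 directions.

    Code 0 is east, and codes increase counter-clockwise in 45-degree steps
    (2 = north, 4 = west, 6 = south), assuming y grows upwards.
    """
    codes = []
    for (x0, y0), (x1, y1) in zip(stroke, stroke[1:]):
        if (x0, y0) == (x1, y1):
            continue  # the pen did not move between these samples
        angle = math.atan2(y1 - y0, x1 - x0)
        codes.append(round(angle / (math.pi / 4)) % 8)
    return codes
```

Two signatures with the same final shape but a different drawing order give different code sequences, which is the extra information an image-only method cannot see.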
Pre-processing
OCR software often "pre-processes" images to improve the chances of successful recognition. Techniques include:
De-skew – If the document was not aligned properly when scanned, it may need to be tilted a few degrees clockwise or counterclockwise in order to make lines of text perfectly horizontal or vertical.
Despeckle – remove positive and negative spots, smoothing edges
Binarization – Convert an image from color or greyscale to black-and-white (called a "binary image" because there are two colours). In some cases, this is necessary for the character recognition algorithm; in other cases, the algorithm performs better on the original image and so this step is skipped.
Line removal – Cleans up non-glyph boxes and lines
Layout analysis or "zoning" – Identifies columns, paragraphs, captions, etc. as distinct blocks. Especially important in multi-column layouts and tables.
Line and word detection – Establishes baseline for word and character shapes, separates words if necessary.
Script recognition – In multilingual documents, the script may change at the level of the words and hence, identification of the script is necessary, before the right OCR can be invoked to handle the specific script.
Character isolation or "segmentation" – For per-character OCR, multiple characters that are connected due to image artifacts must be separated; single characters that are broken into multiple pieces due to artifacts must be connected.
Normalize aspect ratio and scale
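Two of the steps listed above, binarization and scale normalization, can be sketched in plain Python. The representation (an image as a list of rows of 0..255 grey values), the fixed threshold of 128, and the function names are assumptions for illustration; real OCR software typically uses adaptive thresholds and better resampling.

```python
def binarize(gray, threshold=128):
    """Map a greyscale image (rows of 0..255 values) to 1 for ink, 0 for background."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def normalize(binary, size=8):
    """Crop to the ink bounding box and resample to size x size (nearest neighbour)."""
    rows = [r for r, row in enumerate(binary) if any(row)]
    cols = [c for c in range(len(binary[0])) if any(row[c] for row in binary)]
    if not rows:
        return [[0] * size for _ in range(size)]  # blank image: nothing to crop
    top, left = rows[0], cols[0]
    height = rows[-1] - top + 1
    width = cols[-1] - left + 1
    return [
        [binary[top + (r * height) // size][left + (c * width) // size] for c in range(size)]
        for r in range(size)
    ]
```

After normalization every glyph has the same dimensions, which is what a pixel-by-pixel comparison against stored glyphs requires.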
Segmentation of fixed-pitch fonts is accomplished relatively simply by aligning the image to a uniform grid based on where vertical grid lines will least often intersect black areas. For proportional fonts, more sophisticated techniques are needed because whitespace between letters can sometimes be greater than that between words, and vertical lines can intersect more than one character.
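The grid alignment for fixed-pitch fonts can be sketched as a search over horizontal offsets. The sketch assumes the character pitch (in pixels) is already known and that the image is binary with 1 for ink; the function name is illustrative.

```python
def best_grid_offset(binary, pitch):
    """Return the offset (0..pitch-1) whose vertical grid lines cross the least ink."""
    width = len(binary[0])
    # Amount of ink in each pixel column.
    column_ink = [sum(row[c] for row in binary) for c in range(width)]

    def cost(offset):
        # Total ink lying on the grid lines offset, offset + pitch, ...
        return sum(column_ink[c] for c in range(offset, width, pitch))

    return min(range(pitch), key=cost)
```

For proportional fonts no single pitch exists, so this search does not apply, which is why the text calls for more sophisticated techniques there.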
Character recognition
There are two basic types of core OCR algorithm, which may produce a ranked list of candidate characters.
Matrix matching involves comparing an image to a stored glyph on a pixel-by-pixel basis; it is also known as "pattern matching", "pattern recognition", or "image correlation". This relies on the input glyph being correctly isolated from the rest of the image, and on the stored glyph being in a similar font and at the same scale. This technique works best with typewritten text and does not work well when new fonts are encountered. This is the technique the early physical photocell-based OCR implemented, rather directly.
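A toy version of matrix matching is shown below. It assumes that the input glyph has already been isolated and scaled to the same size as the stored templates, as the paragraph above requires; the scoring rule (fraction of agreeing pixels) and the function names are choices made for this sketch.

```python
def match_score(glyph, template):
    """Fraction of pixels on which two equally sized binary glyphs agree."""
    total = agree = 0
    for g_row, t_row in zip(glyph, template):
        for g, t in zip(g_row, t_row):
            total += 1
            agree += (g == t)
    return agree / total


def rank_candidates(glyph, templates):
    """Return (character, score) pairs, best match first."""
    scored = [(ch, match_score(glyph, tpl)) for ch, tpl in templates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Tiny 3x3 templates, for illustration only.
TEMPLATES = {
    "I": [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "L": [[1, 0, 0], [1, 0, 0], [1, 1, 1]],
    "T": [[1, 1, 1], [0, 1, 0], [0, 1, 0]],
}
```

Returning the full ranking rather than a single winner matches the remark that core OCR algorithms may produce a ranked list of candidate characters.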
Feature extraction decomposes glyphs into "features" like lines, closed loops, line direction, and line intersections. These are compared with an abstract vector-like representation of a character, which might reduce to one or more glyph prototypes. General techniques of feature detection in computer vision are applicable to this type of OCR, which is commonly seen in "intelligent" handwriting recognition and indeed most modern OCR software.[9] Nearest neighbour classifiers such as the k-nearest neighbors algorithm are used to compare image features with stored glyph features and choose the nearest match.[15]
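The nearest-neighbour step can be illustrated with a deliberately simple feature vector. The four quadrant ink fractions used here are a stand-in for the line, loop and intersection features named above, not an implementation of them, and the function names are hypothetical.

```python
import math


def features(glyph):
    """Toy feature vector: ink fraction in each quadrant of a binary glyph."""
    h, w = len(glyph), len(glyph[0])
    mh, mw = h // 2, w // 2

    def frac(r0, r1, c0, c1):
        cells = [glyph[r][c] for r in range(r0, r1) for c in range(c0, c1)]
        return sum(cells) / len(cells)

    return (
        frac(0, mh, 0, mw),   # top-left
        frac(0, mh, mw, w),   # top-right
        frac(mh, h, 0, mw),   # bottom-left
        frac(mh, h, mw, w),   # bottom-right
    )


def nearest_label(glyph, labelled):
    """1-nearest-neighbour over (feature_vector, label) pairs, Euclidean distance."""
    vec = features(glyph)
    return min(labelled, key=lambda item: math.dist(vec, item[0]))[1]
```

Because the comparison happens in feature space, a glyph with a missing or extra pixel can still land closest to the right prototype.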
Software such as Cuneiform and Tesseract use a two-pass approach to character recognition. The second pass is known as "adaptive recognition" and uses the letter shapes recognized with high confidence on the first pass to recognize better the remaining letters on the second pass. This is advantageous for unusual fonts or low-quality scans where the font is distorted (e.g. blurred or faded).
Post-processing
OCR accuracy can be increased if the output is constrained by a lexicon – a list of words that are allowed to occur in a document. This might be, for example, all the words in the English language, or a more technical lexicon for a specific field. This technique can be problematic if the document contains words not in the lexicon, like proper nouns. Tesseract uses its dictionary to influence the character segmentation step, for improved accuracy.
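A lexicon constraint can be sketched with `difflib.get_close_matches` from the Python standard library. This is a simplified post-correction of whole words, not how Tesseract applies its dictionary; the function name, the three-word lexicon and the 0.6 similarity cutoff are assumptions for illustration.

```python
import difflib


def constrain_to_lexicon(word, lexicon, cutoff=0.6):
    """Replace an OCR word with its closest lexicon entry, or keep it unchanged."""
    if word in lexicon:
        return word
    matches = difflib.get_close_matches(word, lexicon, n=1, cutoff=cutoff)
    return matches[0] if matches else word


LEXICON = ["attendance", "bluetooth", "signature"]
```

With this lexicon, the misread "attendanoe" is corrected to "attendance", while a proper noun such as "Kumar" has no close entry and is left as it is. Lowering the cutoff would correct more errors but would also start rewriting such out-of-lexicon words, which is the problem the paragraph above describes.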
The output stream may be a plain text stream or file of characters, but more sophisticated OCR systems can preserve the original layout of the page and produce, for example, an annotated PDF that includes both the original image of the page and a searchable textual representation.
"Near-neighbor analysis" can make use of co-occurrence frequencies to correct errors, by
