
Saturday, August 22, 2009

Welcome to Ramadan

Welcome, Ramadan, a month full of forgiveness and mercy. A love that always answers the soul's longing for God. This month I will submit everything to You, my Love. Not because of heaven or hell, but because You love ....

On August 17, 64 years ago, Ramadan was a silent witness to the independence of Indonesia. During Ramadan 5 years ago I could still see the girl that I love, and June 5, 24 years ago, is my birth date.

Ramadan is always so special for me

Thursday, August 20, 2009

Filter Stop List

A stop list is text data kept in a dedicated list: the words that are omitted during text mining. Typical stop list entries are subjects, pronouns, conjunctions, etc.

Data in the stop list is not needed when searching text data for information and knowledge. For example, in the sentence "I love you", the words "I" and "you" carry no meaning for the text, because the knowledge lies in the word "love". The words "I" and "you" therefore belong in the stop list.

In a text mining system, the more detailed the stop list, the better, because stemming, tagging, and keyword search then run over less text and finish sooner. Efficiency in time and memory is the main goal of building a stop list.
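The filtering described above can be sketched in a few lines of Python. The contents of `STOP_LIST` and the function name `remove_stop_words` are assumptions made for illustration; a real system would load a much longer list.

```python
# Hypothetical stop list; a real one would be far longer.
STOP_LIST = {"i", "you", "and", "the", "a"}

def remove_stop_words(words):
    """Return only the words that are not in the stop list (case-insensitive)."""
    return [w for w in words if w.lower() not in STOP_LIST]

print(remove_stop_words(["I", "love", "you"]))
```

Using a set keeps each lookup cheap, which fits the goal of saving time in the later steps.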

Wednesday, August 19, 2009

Stemming the sentences

"Stemming the sentence" is the process of splitting a sentence into its individual words. In programming it is usually known as "tokenizing". Why should a sentence be parsed into words? So that text recognition runs accurately and quickly. By recognizing the text word by word, we can be sure that no data is left unprocessed.

The problem that arises is how to handle repeated (reduplicated) words in the Indonesian language. Such words of course need additional processing. A repeated word can either be treated as a single word of its own, or be counted twice. That is up to the case faced by the system analyst.
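A minimal sketch of this splitting step, including the two options for repeated words, might look like the following. The function name `tokenize`, the flag `split_repeated`, and the assumption that repeated words are written with a hyphen (for example "kata-kata") are mine, not taken from the project.

```python
def tokenize(sentence, split_repeated=False):
    """Split a sentence on whitespace.

    If split_repeated is True, a hyphenated repeated word such as
    "kata-kata" is counted twice ("kata", "kata"); otherwise it is
    kept as one word.
    """
    result = []
    for word in sentence.split():
        parts = word.split("-")
        if split_repeated and len(parts) == 2 and parts[0] == parts[1]:
            result.extend(parts)
        else:
            result.append(word)
    return result

print(tokenize("kata-kata itu indah"))
print(tokenize("kata-kata itu indah", split_repeated=True))
```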

What is "Preprocessing"?

"Preprocessing" is a data cleaning process that makes the data easier for the system to process. It ensures that, before knowledge or keywords are extracted, the data no longer contains dirty data, that is, data that is not needed. Examples of dirty data:
  • Digits
  • Punctuation
  • Bullets

Based on my research, digits, punctuation marks, and bullets have no significant influence on text data. This is because the data in the text will eventually be ranked, and dirty data carries no meaning on its own.

For that reason, digits, punctuation marks, and bullets need to be erased before the other text mining processes run.
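The cleaning step can be sketched with regular expressions. This is only an illustration under my own assumptions (the function name `preprocess`, and treating every non-word, non-space symbol as punctuation or a bullet); it is not the code used in the project.

```python
import re

def preprocess(text):
    """Remove digits, punctuation marks, and bullet symbols from text."""
    text = re.sub(r"\d", " ", text)        # digits
    text = re.sub(r"[^\w\s]", " ", text)   # punctuation and bullets such as "•"
    text = text.replace("_", " ")          # "_" counts as a word character in \w
    return " ".join(text.split())          # collapse the leftover spaces

print(preprocess("• Faith, in 6 pillars!"))
```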

This is the preprocessing scheme:

Tuesday, August 18, 2009

The Text Mining

Text mining is a system for mining data in the form of text. Mining here can be compared to someone working a gold mine. The mining equipment brings up a lot of clay, but we separate it out and find where the gold is.

Text mining is the same: we search for knowledge or important data in a large amount of text. The result is keywords, or data that indicates the content of the original text.

Here is a text mining scheme:


For more detail, wait for the next article.

Thursday, August 13, 2009

Searching Al-Qur'an Verses (Ayat) by Content for Problems of Belief Using Text Mining (My Final Project Proposal)

The Internet is widely used by the community. Many people look for solutions, tutorials, and discussions, or simply browse discussions on the Internet. One thing they look for is solutions to problems of belief.

The verses of the Al-Qur'an are not organized by problem but are scattered, so it takes a long time to find the verses needed to solve a problem of faith.

This final project will therefore help users find reference verses of the Al-Qur'an and their translations for problems of belief. Because it is web-based, it is easily accessible and can be used by anyone.

Keywords: Internet, Text Mining, Al-Qur'an, belief

The background for choosing this final project title is the difficulty some people have in applying the Al-Qur'an in life, especially in solving everyday problems, because the Al-Qur'an in its conventional form is hard to study.

The verses of the "Al-Qur'an" are not organized by problem but are scattered, so it takes a long time to find the verses needed to solve a problem of faith.
Searching for verses the conventional way, or with the digital Al-Qur'an available on the Internet (e.g. www. QuranDigital.Com-Al), does not help enough if the result we want is the verses that fit the particular problem we face. A system is therefore required that can identify, find, and group the problems entered by the user, so that it can display verses of the "Al-Qur'an" as references and solutions.

For the problems above, the text recognition process of Text Mining can be used. In this process the problem entered by the user is run through several methods, such as parsing, stemming, and morphing, so that the problem can be recognized.
This final project is expected to help users find reference verses of the "Al-Qur'an" and their translations for problems of belief. Because it is web-based, it is easily accessible and can be used by anyone.

The author is interested in writing about and discussing text mining of the Al-Qur'an, especially extracting information about problems of faith from its verses. The goal is to produce a web application that can:
  1. Search for verses of the Al-Qur'an and their translations on issues of belief
  2. Provide solutions to problems by displaying the verses and their translations.
The problems addressed in this final project are:

• How to find keywords in the problem submitted?
• How to determine whether a problem is a problem of belief or not?
• How to relate a problem of belief to the verses of the "Al-Qur'an" and their translations?

TEXT MINING
Text mining is the process of mining text data, where the data source is usually a document, with the goal of finding the words that represent the document so that relationships between documents can be analyzed [1].
Text data is processed into numeric data that can be processed further. In text mining, the term for this is data preprocessing: the preliminary processing applied to text data in order to generate numeric data [3].


Preprocessing is carried out in several phases, namely [4]:
  • Removing document format and markup. This step is taken when the text is not plain text, since the documents we usually see are in non-text formats such as HTML, PDF, or Word. These formats require additional elements in the text in order to produce a display that is easy on the eyes. This information is removed because it is not needed and does not reflect the content of the text document.
  • Removing punctuation and numbers. These are also considered unimportant, because the research I do does not look at the relationships between words, sentences, and the like, so each word is considered independently.
  • Changing all capital letters to lowercase.
  • Parsing and stemming. Decomposing the text into single words and reducing words to their base form, so that words sharing a base form are treated as the same.
  • Weighting. Starting with a count of the words in each document, which is then used by the desired weighting scheme.

According to the Digital Al-Qur'an, the science of belief concerns faith in and conviction about Allah, grouped into the pillars of faith that we know [5] [6] [9]:

1. Faith in God

2. Faith in the Angels of God

3. Faith in the Scriptures

4. Faith in the Messengers of Allah

5. Faith in the Day of Resurrection

6. Faith in destiny

Here is the system architecture design of the final project. The user sends a request to the web server, the web server does the computation and matches keywords against the database, and then the web server returns the results to the user.

The processes in this final project are:
  1. Preprocessing: a method that removes punctuation marks, numbers, symbols, and markup, and makes all words uniform by converting them to lowercase.
  2. Stemming the sentence: splitting the sentence into words that stand alone, because in this text mining the position of a word within the sentence has no relevance.
  3. Stemming the words: cutting off the affixes so that only the base form of each word remains, which keeps the weighting valid.
  4. Weighting: counting and grouping the base words.
  5. Keyword search: a keyword is a word that occurs more than 2 times and is a keyword in the belief domain.
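Steps 3 to 5 of the list above could be sketched as follows. The affix lists, the belief keyword set, and the reading of "more than 2" as an occurrence count are all assumptions made for illustration; the project's actual stemming algorithm and keyword database are not shown in this post.

```python
from collections import Counter

# Hypothetical affix lists and keyword set, for illustration only.
PREFIXES = ("me", "ber", "di", "ke")
SUFFIXES = ("kan", "an", "nya")
BELIEF_KEYWORDS = {"iman", "malaikat", "kitab"}

def stem(word):
    """Strip at most one known prefix and one known suffix (naive stemming)."""
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) - len(prefix) >= 3:
            word = word[len(prefix):]
            break
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[:-len(suffix)]
            break
    return word

def find_keywords(words):
    """Weight the base words by count, then keep belief keywords seen more than 2 times."""
    counts = Counter(stem(w) for w in words)
    return sorted(w for w, n in counts.items() if n > 2 and w in BELIEF_KEYWORDS)

print(find_keywords(["iman", "beriman", "imannya", "kitab", "malaikat", "kitab"]))
```

The length guard in `stem` is a crude way to avoid cutting short base words down to nothing; a dictionary lookup of base words would be more reliable.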




Conclusion

Based on the results of the experiments and analysis carried out in this final project, it can be concluded that:

1) The system that was built is able to meet the need for an application that searches verses of the "Al-Qur'an" related to problems of belief using Text Mining.
2) The affixes of all keywords can be identified, except for the word "kingdom", whose base word is "king".
3) The system does not store all the data of the "Al-Qur'an" for a problem, but at least 70% of the "Al-Qur'an" was taken. The data taken is considered to represent the rest.


2. Suggestions

Here are some suggestions for future development of the system, based on the results of the design, implementation, and testing:
1) Complete the verse data of the "Al-Qur'an", which of course must go hand in hand with a larger database than the "MYSQL" one.
2) Complete the keyword data for problems of belief.
3) Improve the stemming algorithm for cases such as the word "kingdom", by:
a. First checking in the database whether the word is already a base word. If it is, the word does not need to go through stemming.
b. Perfecting the affix combination algorithm.
4) Build the application as a "WAP" or desktop application, so that the web service can be accessed by applications on various platforms.
5) Build a feature for entering base words and affixes, protected by administrator authentication.

Just Joke



One day, John was near death. In pain, he spoke to his wife, Alice.

John: My dear, after I die I want you to marry Bob. I will rest in peace in heaven if you become his wife.

Alice: Bob has been your enemy since you were young. Why do you want me to marry Bob?

John: Because I want him to feel what I have suffered ever since I married you.

Ensuring Cryptographic Security

In this world there are parties who want to extract information from ciphertext that has already been encrypted. Such a party is called a cryptanalyst.
Cryptanalysis can also be defined as the art or science of turning ciphertext back into plaintext by exploiting security holes in a cryptographic system. For this reason cryptanalysis is labeled an illegitimate way of reading ciphertext. Based on the activity of the attacker, attacks can be divided into two types, namely:

a) Passive attacks, in which the attacker only monitors the communication channel. A passive attacker only threatens the confidentiality of data.

b) Active attacks, in which the attacker tries to remove, add, or otherwise change the transmission on the communication channel. An active attacker threatens data integrity and authentication as well as confidentiality.


Types of Attacks
There are several types of attacks a cryptanalyst can mount, under the assumption that the cryptanalyst knows the cryptographic algorithm used in the system under attack, namely:

1. Ciphertext Only Attack
The cryptanalyst only has some ciphertexts obtained by eavesdropping, and knows neither the key nor the plaintext. The cryptanalyst's job is to obtain the decryption key or the plaintext.

2. Known Plaintext Attack
The cryptanalyst manages to obtain a piece of the plaintext and the full ciphertext, and believes the two are related. For example, the plaintext snippet is believed to be a letter because it contains the closing phrase "respectfully yours". The cryptanalyst then tries to find the part of the ciphertext that corresponds to "respectfully yours". The next task is to find the decryption key from the little information he has.

3. Chosen Plaintext Attack
The cryptanalyst not only knows a plaintext and its ciphertext as in case 2 above, but is also free to choose plaintexts thought to correspond to a certain part of the ciphertext. The cryptanalyst's next task is to guess the key.

4. Adaptive Chosen Plaintext Attack
This attack is a special case of the third type above. The cryptanalyst can not only choose the plaintext to be encrypted, but can also modify the choice based on the results of previous encryptions. In a chosen plaintext attack he may only be able to choose one large block of plaintext to encrypt, whereas in this attack he can choose a smaller block of plaintext and then choose the next one based on the previous results.

5. Chosen Ciphertext Attack
The cryptanalyst can choose different ciphertexts to be decrypted and has access to the resulting plaintext. For example, the cryptanalyst has access to an electronic box that performs decryption automatically. The cryptanalyst's job is to find the decryption key.

6. Chosen Text
Chosen text is a combination of the chosen plaintext attack and the chosen ciphertext attack. Here the cryptanalyst already knows the encryption algorithm used and the ciphertext to be read. The cryptanalyst can also choose plaintexts to be encrypted, together with the matching ciphertexts produced under a particular secret key.

Conditions that ensure the security of an algorithm

There are 3 conditions that, when fulfilled by a cryptographic algorithm, allow it to guarantee the confidentiality of the communication, namely:

1. The cost of attacking or breaking the cryptographic algorithm is greater than the value of the information that would be obtained from the attack. For example, a computer system worth 1 billion is required to break the algorithm used to protect information worth 500 million.

2. The time needed to break the algorithm is longer than the period for which the information remains valid. For example, breaking a credit card takes 1 year, while the credit card expires before that year is over.

3. The amount of ciphertext produced by the cryptographic algorithm is less than the amount of ciphertext needed to break it. For example, 1000 bits of ciphertext are required to guess the key used by an algorithm, while the encryption process produces less than 1000 bits of data.

About Taher El Gamal (wikipedia)


Dr. Taher Elgamal

(born 18 August 1955) is an Egyptian cryptographer. Elgamal is sometimes written as El Gamal or ElGamal, but Elgamal is now preferred. In 1985, Elgamal published a paper titled A Public key Cryptosystem and A Signature Scheme based on discrete Logarithms in which he proposed the design of the ElGamal discrete log cryptosystem and of the ElGamal signature scheme. The latter scheme became the basis for Digital Signature Algorithm (DSA) adopted by National Institute of Standards and Technology (NIST) as the Digital Signature Standard (DSS). He also participated in the 'SET' credit card payment protocol, plus a number of Internet payment schemes.

Elgamal has gained a Bachelor of Science degree from Cairo University, and Masters and Doctorate degrees in Computer Science from Stanford University. He served as chief scientist at Netscape Communications from 1995 to 1998 where he was a driving force behind SSL. He also was the director of engineering at RSA Security Inc. before founding Securify in 1998 and becoming their CEO. When Securify was acquired by Kroll-O'Gara[1], he became the president of its information security group. In 2008, Securify was acquired by Secure Computing and is now part of McAfee. In October 2006 he joined Tumbleweed Communications [2] in a capacity of a Chief Technology Officer. Tumbleweed was acquired in 2008 by Axway Inc. He is an advisor to Onset Ventures, glenbrook partners, PGP corporation, Arcot Systems, Finjan, Facetime and serves as Chief Security Officer of Axway, Inc. In addition, Elgamal sits on the board of Vindicia, a company which provides online payment services, as well as the Advisory Board of SignaCert, Inc., a company providing Independent IT Controls measurement and software verification at the binary level.


Example of El Gamal Cryptography in Numbers



















The picture above is a numerical example of ElGamal. Clark Kent first chooses a private key (x) and then computes the public key (y). How to compute y is described in the addition chaining post on this blog. After that, Clark tells Lex Luthor the public key.

Lex Luthor will send the character 'A' to Clark. Knowing Clark's public key but not Clark's private key, he computes the ciphertext (a, b) from the character's ASCII code, and a and b are sent to Clark. Clark receives the ciphertext, then decrypts it using a and b. The resulting plaintext is 'A', the same as what Lex Luthor wanted to send.
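Since the picture may not display, here is a small sketch of the same exchange in Python. The prime p = 223 and the shifted code for 'A' come from the ASCII table post on this blog; the base g, the private key x, and the random number k are values I picked for illustration and are not the numbers from the picture.

```python
p = 223            # prime modulus (from the ASCII table post)
g = 3              # assumed base, chosen only for this sketch
x = 17             # Clark's private key (assumed)
y = pow(g, x, p)   # Clark's public key

def encrypt(m, k):
    """Lex encrypts block m with Clark's public key y and a random k."""
    a = pow(g, k, p)
    b = (pow(y, k, p) * m) % p
    return a, b

def decrypt(a, b):
    """Clark recovers m from (a, b) with his private key x."""
    inverse = pow(a, p - 1 - x, p)   # inverse of a^x mod p, since a^(p-1) = 1 mod p
    return (b * inverse) % p

m = ord("A") - 32                    # 'A' in the shifted table is 33
a, b = encrypt(m, k=5)
print(chr(decrypt(a, b) + 32))
```

A toy modulus like this is far too small to be secure; it only makes the arithmetic easy to follow.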

Tuesday, August 11, 2009

Number Of ElGamal



If all the prime factors of p - 1 are relatively
small, lots of cryptographic attacks are possible. Generally, primes p such that p-1 has a big prime
factor are much better.
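As a rough check of this advice, the sketch below finds the largest prime factor of p - 1 by trial division. The function name is my own, and trial division is only practical for small numbers like the toy primes on this blog.

```python
def largest_prime_factor(n):
    """Return the largest prime factor of n (n >= 2) by trial division."""
    largest = 1
    factor = 2
    while factor * factor <= n:
        while n % factor == 0:
            largest = factor
            n //= factor
        factor += 1
    if n > 1:          # whatever is left is itself prime
        largest = n
    return largest

# p = 223 is used elsewhere on this blog; 227 is an extra example of mine.
for p in (223, 227):
    print(p, largest_prime_factor(p - 1))
```

For p = 223, p - 1 = 222 = 2 x 3 x 37, so the largest prime factor is 37. For p = 227, p - 1 = 226 = 2 x 113, which has the kind of large prime factor recommended above.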









Note: k; r can be computed before the message is seen. In addition, you need a new k and r
every time you sign a message. Otherwise, it will not be secure.
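The formulas that this note refers to may not display here, so the following is a sketch assuming the classic ElGamal signature: r = g^k mod p and s = (m - x r) k^(-1) mod (p - 1), verified by checking g^m = y^r r^s mod p. The values of p, g, x and k are toy numbers of my own.

```python
from math import gcd

p, g, x = 223, 3, 17     # toy prime, assumed base, assumed private key
y = pow(g, x, p)         # public key

def sign(m, k):
    """Sign m with a fresh k; r depends only on k, so it can be computed in advance."""
    if gcd(k, p - 1) != 1:
        raise ValueError("k must be coprime with p - 1")
    r = pow(g, k, p)
    k_inv = pow(k, -1, p - 1)            # modular inverse, needs Python 3.8+
    s = ((m - x * r) * k_inv) % (p - 1)
    return r, s

def verify(m, r, s):
    """Check that g^m equals y^r * r^s modulo p."""
    if not 0 < r < p:
        return False
    return pow(g, m, p) == (pow(y, r, p) * pow(r, s, p)) % p

r, s = sign(33, k=5)
print(verify(33, r, s), verify(34, r, s))
```

Reusing the same k for two messages would let an attacker solve for x from the two values of s, which is why the note insists on a new k and r each time.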







Digital Signature Algorithm (DSA)

Addition Chaining or Divide and Conquer for ElGamal Programming

The exponentiation and modulo calculations in El Gamal can be done by hand or with a calculator, but when they are coded in a program the result comes out as 0. This is because no data type in the programming language can hold the value before the modulus is taken. To solve this problem, the formulas for calculating a and y are reduced as follows.

vw mod p = [( v mod p )( w mod p )] mod p
example :



In discrete mathematics the method above is a technique called divide and conquer. It is also referred to as the addition chaining technique, because the modulo operation is applied together with each multiplication. With this technique, the intermediate results never reach a large number.
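One common way to apply the identity above is square-and-multiply, sketched below. This is my own illustration, not the code from the email application. Python integers do not overflow, so the point here is only to show that every intermediate value stays below p squared, which is what matters in languages with fixed-size integers.

```python
def mod_pow(base, exponent, p):
    """Compute base**exponent mod p, reducing mod p after every multiplication."""
    result = 1
    base %= p
    while exponent > 0:
        if exponent & 1:                  # this bit of the exponent is set
            result = (result * base) % p
        base = (base * base) % p          # square for the next bit
        exponent >>= 1
    return result

# Same answer as Python's built-in three-argument pow.
print(mod_pow(3, 17, 223) == pow(3, 17, 223))
```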

Besides y and a, the value of b has a similar problem. In fact, for b the integer before the modulus is even larger, because on top of the exponentiation with a large number, the result is also multiplied by the value of the plaintext block mj. With the divide and conquer method, the formula for b can also be broken down

into:


From the equation above it can be concluded that mj does not need to be a factor while y^k mod p is being calculated. It is multiplied in afterwards, once y^k mod p has been obtained, and the product is then reduced modulo p once more.
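In code, the order of operations described above might look like this. The function name `compute_b` is hypothetical, and the formula b = (y^k x mj) mod p is my reading of the text, since the equation image may not display.

```python
def compute_b(y, k, m_j, p):
    """Compute b = (y^k * m_j) mod p without building the huge value y^k * m_j."""
    yk = pow(y, k, p)               # step 1: y^k mod p on its own
    return (yk * (m_j % p)) % p     # step 2: multiply by the block, reduce again

# Compare with the direct calculation on small numbers.
print(compute_b(10, 5, 33, 223) == (10**5 * 33) % 223)
```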

ASCII TABLE FOR AN EMAIL APPLICATION USING ELGAMAL CRYPTOGRAPHY

{“32”,” ”} 0
{“33”,”!”} 1
{“34”,”””} 2
{“35”,”#”} 3
{“36”,”$”} 4
{“37”,”%”} 5
{“38”,”&”} 6
{“39”,”’”} 7
{“40”,”(”} 8
{“41”,”)”} 9
{“42”,”*”} 10
{“43”,”+”} 11
{“44”,”,”} 12
{“45”,”-”} 13
{“46”,”.”} 14
{“47”,”/”} 15
{“48”,”0”} 16
{“49”,”1”} 17
{“50”,”2”} 18
{“51”,”3”} 19
{“52”,”4”} 20
{“53”,”5”} 21
{“54”,”6”} 22
{“55”,”7”} 23
{“56”,”8”} 24
{“57”,”9”} 25
{“58”,”:”} 26
{“59”,”;”} 27
{“60”,”<”} 28
{“61”,”=”} 29
{“62”,”>”} 30
{“63”,”?”} 31
{“64”,”@”} 32
{“65”,”A”} 33
{“66”,”B”} 34
{“67”,”C”} 35
{“68”,”D”} 36
{“69”,”E”} 37
{“70”,”F”} 38
{“71”,”G”} 39
{“72”,”H”} 40
{“73”,”I”} 41
{“74”,”J”} 42
{“75”,”K”} 43
{“76”,”L”} 44
{“77”,”M”} 45
{“78”,”N”} 46
{“79”,”O”} 47
{“80”,”P”} 48
{“81”,”Q”} 49
{“82”,”R”} 50
{“83”,”S”} 51
{“84”,”T”} 52
{“85”,”U”} 53
{“86”,”V”} 54
{“87”,”W”} 55
{“88”,”X”} 56
{“89”,”Y”} 57
{“90”,”Z”} 58
{“91”,”[”} 59
{“92”,”\”} 60
{“93”,”]”} 61
{“94”,”^”} 62
{“95”,”_”} 63
{“96”,”`”} 64
{“97”,”a”} 65
{“98”,”b”} 66
{“99”,”c”} 67
{“100”,”d”} 68
{“101”,”e”} 69
{“102”,”f”} 70
{“103”,”g”} 71
{“104”,”h”} 72
{“105”,”i”} 73
{“106”,”j”} 74
{“107”,”k”} 75
{“108”,”l”} 76
{“109”,”m”} 77
{“110”,”n”} 78
{“111”,”o”} 79
{“112”,”p”} 80
{“113”,”q”} 81
{“114”,”r”} 82
{“115”,”s”} 83
{“116”,”t”} 84
{“117”,”u”} 85
{“118”,”v”} 86
{“119”,”w”} 87
{“120”,”x”} 88
{“121”,”y”} 89
{“122”,”z”} 90
{“123”,”{”} 91
{“124”,”|”} 92
{“125”,”}”} 93
{“126”,”~”} 94
{“127”,”DEL”} 95
{“128”,”Ç”} 96
{“129”,”ü”} 97
{“130”,”é”} 98
{“131”,”â”} 99
{“132”,”ä”} 100
{“133”,”à”} 101
{“134”,”å”} 102
{“135”,”ç”} 103
{“136”,”ê”} 104
{“137”,”ë”} 105
{“138”,”è”} 106
{“139”,”ї”} 107
{“140”,”î”} 108
{“141”,”ì”} 109
{“142”,”Ä”} 110
{“143”,”Å”} 111
{“144”,”É”} 112
{“145”,”æ”} 113
{“146”,”Æ”} 114
{“147”,”ô”} 115
{“148”,”ö”} 116
{“149”,”ò”} 117
{“150”,”û”} 118
{“151”,”ù”} 119
{“152”,”_”} 120
{“153”,”Ö”} 121
{“154”,”Ü”} 122
{“155”,”blank”} 123
{“156”,”£”} 124
{“157”,”¥”} 125
{“158”,”_”} 126
{“159”,”ƒ”} 127
{“160”,”á”} 128
{“161”,”í”} 129
{“162”,”ó”} 130
{“163”,”ú”} 131
{“164”,”ñ”} 132
{“165”,”Ñ”} 133
{“166”,”ª”} 134
{“167”,”°”} 135
{“168”,”¿”} 136
{“169”,”_”} 137
{“170”,”¬”} 138
{“171”,”½”} 139
{“172”,”¼”} 140
{“173”,”¡”} 141
{“174”,”«”} 142
{“175”,”»”} 143
{“176”,”░”} 144
{“177”,”▒”} 145
{“178”,”▓”} 146
{“179”,”│”} 147
{“180”,”┤”} 148
{“181”,”╡”} 149
{“182”,”╢”} 150
{“183”,”╖”} 151
{“184”,”╕”} 152
{“185”,”╣”} 153
{“186”,”║”} 154
{“187”,”╗”} 155
{“188”,”╝”} 156
{“189”,”╜”} 157
{“190”,”╛”} 158
{“191”,”┐”} 159
{“192”,”└”} 160
{“193”,”┴”} 161
{“194”,”┬”} 162
{“195”,”├”} 163
{“196”,”─”} 164
{“197”,”┼”} 165
{“198”,”╞”} 166
{“199”,”╟”} 167
{“200”,”╚”} 168
{“201”,”╔”} 169
{“202”,”╩”} 170
{“203”,”╦”} 171
{“204”,”╠”} 172
{“205”,”═”} 173
{“206”,”╬”} 174
{“207”,”╧”} 175
{“208”,”╨”} 176
{“209”,”╤”} 177
{“210”,”╥”} 178
{“211”,”╙”} 179
{“212”,”╘”} 180
{“213”,”╒”} 181
{“214”,”╓”} 182
{“215”,”╫”} 183
{“216”,”╪”} 184
{“217”,”┘”} 185
{“218”,”┌”} 186
{“219”,”█”} 187
{“220”,”▄”} 188
{“221”,”▌”} 189
{“222”,”▐”} 190
{“223”,”▀”} 191
{“224”,”α”} 192
{“225”,”β”} 193
{“226”,”Γ”} 194
{“227”,”π”} 195
{“228”,”∑”} 196
{“229”,”σ”} 197
{“230”,”μ”} 198
{“231”,”τ”} 199
{“232”,”Φ”} 200
{“233”,”Θ”} 201
{“234”,”Ω”} 202
{“235”,”δ”} 203
{“236”,”∞”} 204
{“237”,”ф”} 205
{“238”,”ε”} 206
{“239”,”∩”} 207
{“240”,”≡”} 208
{“241”,”±”} 209
{“242”,”≥”} 210
{“243”,”≤”} 211
{“244”,”⌠”} 212
{“245”,”⌡”} 213
{“246”,”÷”} 214
{“247”,”≈”} 215
{“248”,”°”} 216
{“249”,”.”} 217
{“250”,”.”} 218
{“251”,”√”} 219
{“252”,”_”} 220
{“253”,”²”} 221
{“254”,”■”} 222


Description:
{"Xxx", "Y"} aaa
Xxx = the original ASCII code
Y = the character for that code
aaa = the new code, shifted because codes 0 - 31 are not used
Number of characters = 223, the same as the value of p, which is 223 (a prime that also covers the range of the data)
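The shift described above is simply "original code minus 32". A small sketch, with function names of my own, limited to the plain ASCII rows (32 to 127), because the extended rows of the table follow an old DOS character set that Python's `ord` and `chr` do not reproduce directly:

```python
OFFSET = 32    # codes 0 - 31 are not used
P = 223        # number of characters, also the prime p

def to_new_code(ch):
    """Map a character to its shifted code (plain ASCII rows only)."""
    code = ord(ch)
    if not 32 <= code <= 127:
        raise ValueError("only the plain ASCII rows are handled in this sketch")
    return code - OFFSET

def to_char(new_code):
    """Map a shifted code back to its character (plain ASCII rows only)."""
    if not 0 <= new_code <= 127 - OFFSET:
        raise ValueError("only the plain ASCII rows are handled in this sketch")
    return chr(new_code + OFFSET)

print(to_new_code("A"), to_char(33))
```

The extended rows would need an explicit lookup table built from the list above.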

Sunday, August 9, 2009

Respected, or Feared?



Some people are valued because others fear them, while some are appreciated because others truly respect them. What is the difference between respect and fear? If we are respected out of fear, then later, when we are weak, people will no longer fear us. However, if we are appreciated out of genuine respect, then in any situation people will still appreciate us.

I am confused: am I appreciated because other people fear me, or because they respect me? Or could it be something worse, because they are "required" to?

Sunday, August 2, 2009

The El Gamal Cryptosystem

We have seen that the security of the RSA Cryptosystem is related to the difficulty of
factoring large numbers. It is possible to construct Cryptosystems based on other difficult
number-theoretic problems. We now consider the El Gamal Cryptosystem, named after its
inventor, Taher El Gamal, which is based on the difficulty of a problem called the "discrete
logarithm."
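To make the problem concrete: given g, p and y = g^x mod p, the discrete logarithm problem asks for x. The brute-force sketch below (function name and toy values are my own) works for a tiny prime, but the number of steps grows with p, which is why large primes are used in practice.

```python
def discrete_log(g, y, p):
    """Find x with g**x % p == y by trying every exponent; None if there is none."""
    value = 1
    for x in range(p - 1):
        if value == y:
            return x
        value = (value * g) % p
    return None

p, g = 223, 3            # toy values, assumed for illustration
y = pow(g, 17, p)
x = discrete_log(g, y, p)
print(x is not None and pow(g, x, p) == y)
```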
