I don't know why the major search engines still use different encodings. It is always either gb2312 or utf-8, of course, but the encoding problem is a real headache.
We usually get the search keyword by analyzing the URL of the page the visitor came from. For example:
http://www.google.com/search?hl=zh-CN&q=%E5%AD%A4%E7%8B%AC&lr=
As you probably know, the keyword here has been encoded with urlencode.
Getting the keyword takes two steps. The first step is urldecoding. For ordinary request parameters ASP does this by itself, but here we have to decode manually. The second step is converting the decoded text to the encoding of our own page.
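As a quick illustration of the first step, here is a minimal sketch that decodes the example URL above. It is written in Python rather than ASP only because it is short and easy to run, and it assumes the keyword is UTF-8 encoded, which matches the Google example.

```python
from urllib.parse import urlsplit, parse_qs

# Example referrer URL from the text.
referrer = "http://www.google.com/search?hl=zh-CN&q=%E5%AD%A4%E7%8B%AC&lr="

# parse_qs percent-decodes every value; UTF-8 is an assumption here.
params = parse_qs(urlsplit(referrer).query, encoding="utf-8")
keyword = params["q"][0]
print(keyword)
```

The three-byte groups %E5%AD%A4 and %E7%8B%AC each decode to one Chinese character.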
There are many decoding functions on the Internet, but they are all written to decode gb2312 and utf-8 on a gb2312 page. In that case it is easy: decode first, then determine the encoding from the search engine, and if it is utf-8, convert it to gb2312.
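The "decode, then pick the encoding by search engine" idea can be sketched as follows. This is a Python illustration, not the ASP code itself. The function name decode_keyword is hypothetical, and the rule that Google and Yahoo send utf-8 while everything else sends gb2312 is an assumption taken from the branches of GetSearchKeyword.

```python
from urllib.parse import unquote_to_bytes

def decode_keyword(raw_value, referrer):
    """Percent-decode raw_value, choosing the charset from the referring engine."""
    hint = referrer.lower()
    # Assumption: Google and Yahoo use utf-8, all other engines use gb2312.
    charset = "utf-8" if ("google" in hint or "yahoo" in hint) else "gb2312"
    # "+" stands for a space in query strings.
    data = unquote_to_bytes(raw_value.replace("+", " "))
    # Undecodable bytes become U+FFFD instead of raising an error.
    return data.decode(charset, errors="replace")

print(decode_keyword("%E5%AD%A4%E7%8B%AC", "http://www.google.com/search"))
print(decode_keyword("%B9%D8%BC%FC%D7%D6", "http://www.baidu.com/s"))
```

Working on raw bytes first and only then applying a charset keeps the two steps separate, so the same decoder serves both kinds of engine.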
But my website uses utf-8 pages. For a utf-8 page, the only thing I could find was a urldecode for utf-8 characters. I was stuck here for a long time. In the end I had to use the worst method for everything else: submit the extracted keyword through xmlhttp to a gb2312 ASP page, and then convert the gb2312 result, which comes back garbled, to utf-8.
The following is the main implementation code.
Public Function GetSearchKeyword(RefererUrl) 'Extract the search keyword from a referrer URL
    If RefererUrl = "" Or Len(RefererUrl) < 1 Then Exit Function
    On Error Resume Next
    Dim re
    Set re = New RegExp
    re.IgnoreCase = True
    re.Global = True
    Dim a, b, j
    'Loose match on the common query parameter names; this is fast and covers many engines
    re.Pattern = "(word=([^&]*)|q=([^&]*)|p=([^&]*)|query=([^&]*)|name=([^&]*)|_searchkey=([^&]*)|baidu.*?w=([^&]*))"
    Set a = re.Execute(RefererUrl)
    If a.Count > 0 Then
        Set b = a(a.Count - 1).SubMatches
        'SubMatches is zero-based: index 0 is the whole "name=value" match, the rest are the values
        For j = 1 To b.Count - 1
            If Len(b(j)) > 0 Then
                If InStr(1, RefererUrl, "google", 1) Then
                    GetSearchKeyword = Trim(U8Decode(b(j)))
                ElseIf InStr(1, RefererUrl, "yahoo", 1) Then
                    GetSearchKeyword = Trim(U8Decode(b(j)))
                ElseIf InStr(1, RefererUrl, "yisou", 1) Then
                    GetSearchKeyword = Trim(getkey(b(j)))
                ElseIf InStr(1, RefererUrl, "3721", 1) Then
                    GetSearchKeyword = Trim(getkey(b(j)))
                Else
                    GetSearchKeyword = Trim(getkey(b(j)))
                End If
                Exit Function
            End If
        Next
    End If
    If Err Then
        Err.Clear
        GetSearchKeyword = RefererUrl
    Else
        GetSearchKeyword = ""
    End If
End Function
Function URLEncoding(vstrIn)
    Dim strReturn, i, ThisChr, innerCode, Hight8, Low8
    strReturn = ""
    For i = 1 To Len(vstrIn)
        ThisChr = Mid(vstrIn, i, 1)
        If Abs(Asc(ThisChr)) < &HFF Then
            'Single-byte character: copy it unchanged
            strReturn = strReturn & ThisChr
        Else
            'Double-byte character: Asc may return a negative value, so bring it into 0..65535
            innerCode = Asc(ThisChr)
            If innerCode < 0 Then
                innerCode = innerCode + &H10000
            End If
            Hight8 = (innerCode And &HFF00) \ &HFF
            Low8 = innerCode And &HFF
            strReturn = strReturn & "%" & Hex(Hight8) & "%" & Hex(Low8)
        End If
    Next
    URLEncoding = strReturn
End Function
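For comparison, the same kind of gb2312 percent-encoding can be sketched in Python (used here instead of ASP only because it is easy to run). It assumes the text should be encoded as gb2312, which is what URLEncoding effectively produces when the server code page is gb2312; that code page is an assumption, not something the post states.

```python
from urllib.parse import quote

# Encode the characters as gb2312 bytes, then write each byte as %XX.
encoded = quote("关键字", encoding="gb2312")
print(encoded)
```

Unlike URLEncoding above, quote also escapes reserved ASCII characters such as spaces, so the two are not byte-for-byte interchangeable on mixed input.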
Function getkey(key)
    'Fallback: send the still-encoded keyword to a gb2312 ASP page and convert the response
    Dim oReq
    Set oReq = CreateObject("MSXML2.XMLHTTP")
    oReq.open "POST", "http://" & WebUrl & "/system/ShowGb2312XML.asp?a=" & key, False
    oReq.send
    getkey = UTF2GB(oReq.responseText)
End Function
Function chinese2unicode(Str)
    'Writes every character of Str as a hexadecimal numeric character reference (&#x....;)
    Dim i
    Dim Str_one
    Dim Str_unicode
    For i = 1 To Len(Str)
        Str_one = Mid(Str, i, 1)
        Str_unicode = Str_unicode & Chr(38)  '&
        Str_unicode = Str_unicode & Chr(35)  '#
        Str_unicode = Str_unicode & Chr(120) 'x
        Str_unicode = Str_unicode & Hex(AscW(Str_one))
        Str_unicode = Str_unicode & Chr(59)  ';
    Next
    Response.Write Str_unicode
End Function
Function UTF2GB(UTFStr)
    'Replaces each group of three %XX escapes (one 3-byte UTF-8 character) with the decoded character
    Dim Dig, GBStr
    For Dig = 1 To Len(UTFStr)
        If Mid(UTFStr, Dig, 1) = "%" Then
            If Len(UTFStr) >= Dig + 8 Then
                GBStr = GBStr & ConvChinese(Mid(UTFStr, Dig, 9))
                Dig = Dig + 8
            Else
                GBStr = GBStr & Mid(UTFStr, Dig, 1)
            End If
        Else
            GBStr = GBStr & Mid(UTFStr, Dig, 1)
        End If
    Next
    UTF2GB = GBStr
End Function
Function ConvChinese(x)
    'x looks like "%E5%85%B3"; the bytes are turned into bit strings and the UTF-8 marker bits removed
    Dim A, i, j, DigS, Unicode
    A = Split(Mid(x, 2), "%")
    i = 0
    j = 0
    For i = 0 To UBound(A)
        A(i) = c16to2(A(i))
    Next
    For i = 0 To UBound(A) - 1
        DigS = InStr(A(i), "0")
        Unicode = ""
        For j = 1 To DigS - 1
            If j = 1 Then
                A(i) = Right(A(i), Len(A(i)) - DigS)
                Unicode = Unicode & A(i)
            Else
                i = i + 1
                A(i) = Right(A(i), Len(A(i)) - 2)
                Unicode = Unicode & A(i)
            End If
        Next
        If Len(c2to16(Unicode)) = 4 Then
            ConvChinese = ConvChinese & ChrW(Int("&H" & c2to16(Unicode)))
        Else
            ConvChinese = ConvChinese & Chr(Int("&H" & c2to16(Unicode)))
        End If
    Next
End Function
Function U8Decode(enStr)
    'Takes a string of %XX escapes and rebuilds each character according to the UTF-8 rules
    'Input (utf-8):   关 E5 85 B3, 键 E9 94 AE, 字 E5 AD 97
    'Output (gb2312): 关 B9D8, 键 BCFC, 字 D7D6
    Dim c, i, i2, v, deStr, WeiS
    For i = 1 To Len(enStr)
        c = Mid(enStr, i, 1)
        If c = "%" Then
            v = c16to2(Mid(enStr, i + 1, 2))
            'Find the position of the first 0 bit in the lead byte:
            '1 means a single byte, 3 a two-byte sequence, 4 a three-byte sequence, and so on;
            'it cannot be 2 (that marks a continuation byte) or greater than 7.
            'In theory it can reach 7, but in practice sequences are no longer than 3 bytes.
            WeiS = InStr(v, "0")
            v = Right(v, Len(v) - WeiS) 'Lead byte: drop the leftmost WeiS bits
            i = i + 3
            For i2 = 2 To WeiS - 1
                c = c16to2(Mid(enStr, i + 1, 2))
                c = Right(c, Len(c) - 2) 'Continuation bytes: drop the two leftmost bits
                v = v & c
                i = i + 3
            Next
            If Len(c2to16(v)) = 4 Then
                deStr = deStr & ChrW(c2to10(v))
            Else
                deStr = deStr & Chr(c2to10(v))
            End If
            i = i - 1 'The For loop adds 1 again
        Else
            If c = "+" Then
                deStr = deStr & " "
            Else
                deStr = deStr & c
            End If
        End If
    Next
    U8Decode = deStr
End Function
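The bit manipulation in U8Decode is easier to follow on a single character. Below is a minimal Python sketch of the same idea (Python is used only for illustration; the function name decode_utf8_sequence is hypothetical). It takes the bytes of one UTF-8 sequence, finds the first 0 bit of the lead byte, and glues the payload bits together. It assumes the input is a well-formed sequence and does no validation.

```python
def decode_utf8_sequence(byte_values):
    """Turn the bytes of one UTF-8 sequence (a list of ints) into a code point."""
    first = format(byte_values[0], "08b")
    zero_pos = first.index("0") + 1   # 1-based position of the first 0 bit (WeiS)
    bits = first[zero_pos:]           # payload bits of the lead byte
    # For multi-byte characters the sequence has zero_pos - 1 bytes in total;
    # zero_pos = 1 means a single byte with no continuation bytes.
    for b in byte_values[1:max(zero_pos - 1, 1)]:
        bits += format(b, "08b")[2:]  # drop the leading "10" marker
    return int(bits, 2)

code_point = decode_utf8_sequence([0xE5, 0x85, 0xB3])
print(hex(code_point), chr(code_point))
```

For E5 85 B3 the payload bits are 0101, 000101 and 110011, which together give the 16-bit value of the first character in the comment at the top of U8Decode.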
Function c16to2(x)
    'Converts hexadecimal to binary. Any length works; for UTF-8 the input is normally two digits, such as A9
    'Example: "C2" becomes "11000010". "C" is 12 in decimal (1100); 2 is (10), which has fewer than 4 digits and is padded to (0010).
    Dim tempstr
    Dim i: i = 0 'temporary pointer
    For i = 1 To Len(Trim(x))
        tempstr = c10to2(CInt(Int("&h" & Mid(x, i, 1))))
        Do While Len(tempstr) < 4
            tempstr = "0" & tempstr 'Pad to 4 digits
        Loop
        c16to2 = c16to2 & tempstr
    Next
End Function
Function c2to16(x)
    'Converts binary to hexadecimal: every group of four 0/1 digits becomes one hexadecimal digit. The input length does not have to be a multiple of 4.
    Dim i: i = 1 'temporary pointer
    For i = 1 To Len(x) Step 4
        c2to16 = c2to16 & Hex(c2to10(Mid(x, i, 4)))
    Next
End Function
Function c2to10(x)
    'Plain binary to decimal conversion; it does not do the 4-digit zero padding needed for conversion to hexadecimal.
    'This function is very useful and will be needed again. Anyone who has worked on communications or hardware will know it.
    'The binary number is represented as a string here
    c2to10 = 0
    If x = "0" Then Exit Function 'If it is 0, the result is simply 0
    Dim i: i = 0 'temporary pointer
    For i = 0 To Len(x) - 1 'Otherwise add up the 8-4-2-1 place values. I have known this since I first learned computers. I miss Mr. Xie Daojian, who taught us so much!
        If Mid(x, Len(x) - i, 1) = "1" Then c2to10 = c2to10 + 2 ^ (i)
    Next
End Function
Function c10to2(x)
    'Converts decimal to binary (returned as a string)
    Dim sign, result
    result = ""
    'Remember the sign and work on the absolute value
    sign = Sgn(x)
    x = Abs(x)
    If x = 0 Then
        c10to2 = "0"
        Exit Function
    End If
    Do Until x = 0
        result = result & (x Mod 2)
        x = x \ 2 'Integer division
    Loop
    result = StrReverse(result)
    If sign = -1 Then
        c10to2 = "-" & result
    Else
        c10to2 = result
    End If
End Function
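As a cross-check for the conversion helpers above, here is a small Python sketch of what c16to2 and c2to10 compute (Python only for illustration; the names hex_to_bits and bits_to_int are hypothetical).

```python
def hex_to_bits(hex_text):
    """Like c16to2: every hex digit becomes exactly four binary digits."""
    return "".join(format(int(digit, 16), "04b") for digit in hex_text)

def bits_to_int(bits):
    """Like c2to10: read a string of 0s and 1s as an unsigned number."""
    return int(bits, 2) if bits else 0

print(hex_to_bits("C2"))
print(bits_to_int("11000010"))
```

The first call reproduces the "C2" example from the comment in c16to2.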
Function URLDecode(enStr)
    Dim deStr, strSpecial
    Dim c, i, v
    deStr = ""
    strSpecial = "!""#$%&'()*+,/:;<=>?@[]^`{ |}~%"
    For i = 1 To Len(enStr)
        c = Mid(enStr, i, 1)
        If c = "%" Then
            v = Eval("&h" + Mid(enStr, i + 1, 2))
            If InStr(strSpecial, Chr(v)) > 0 Then
                deStr = deStr & Chr(v)
                i = i + 2
            Else
                v = Eval("&h" + Mid(enStr, i + 1, 2) + Mid(enStr, i + 4, 2))
                deStr = deStr & Chr(v)
                i = i + 5
            End If
        Else
            If c = "+" Then
                deStr = deStr & " "
            Else
                deStr = deStr & c
            End If
        End If
    Next
    URLDecode = deStr
End Function
Much of this code comes from the Internet. I could not find the original authors.
PS: I have to face the summer vacation now. For family reasons, I don't want to stay in my city. My high school entrance exam score reached the cutoff for the local key school. I don't want to name the city, or acquaintances will find me. Anywhere is fine as long as it is not here. Can anyone put me in touch with a school in Shandong that counts as a key school?
QQ: 32113739
I am very interested in programming, but in the Information Olympiad I only got a first-class prize, Xth place, because I think technical skill should not be measured by so-called competitions, just as talent should not be measured by those meaningless exams. My electronic works have also won first place at the provincial level... but that is only average. My grades are average too... so an ordinary key school is enough... I just don't want to be too close to home.
I am now quite proficient in ASP, although there are gaps in my knowledge, such as encoding problems (sweat...). But the Internet is huge, and I think knowledge does not come only from textbooks. I am currently reading a book on ASP.NET, so I can definitely help with your school's website.
I am very enthusiastic about new technologies. People say I have no sense of aesthetics, but I do want to be able to look at the structure of my programs without vomiting blood.
Never mind... on with the post.
I have developed something called a CMS: database + asp -> xml + xslt -> xhtml + css.
It also uses the FCK editor, the same one CSDN uses; I only noticed that CSDN had switched to it when I logged in today. I did rewrite the FCK file system completely, though.
This system will definitely be released before the end of the summer vacation. However, many friends say it has usability problems... many people don't know XSLT. Sigh...
If I can't find a school, I might wander off, maybe disappear. Of course this is not a threat... I just hate my city, and hate everything I have seen and done there.