CN101814118A - Method for protecting web texts based on pictures - Google Patents

Publication number
CN101814118A
Authority
CN
China
Prior art keywords
picture
text
literal
information
client
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN200910023185A
Other languages
Chinese (zh)
Inventor
王黎明
李晓东
刘西洋
秦英
姚丹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xidian University
Original Assignee
Xidian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xidian University
Priority to CN200910023185A
Publication of CN101814118A

Abstract

The invention discloses a picture-based method for protecting web page text, in the field of network security. It mainly addresses the large transmission volume and weak security of existing web text protection methods. On the server side, the text is randomly shuffled and a small portion of the characters in the shuffled text is selected to generate a picture; the remaining characters and the coordinate information of each character are then encrypted, and this information is transmitted to the client together with the picture. The client uses the picture sent by the server as the background picture and decrypts the encrypted information. It then renders each character of the decrypted text as a small picture and superimposes that picture on the background picture at the position given by the character's place in the original text, yielding a picture that contains the entire text, which is displayed in the browser. The method restricts copying of the text and can effectively protect the author's copyright and interests.

Description

Method for protecting web texts based on pictures
Technical field
The invention belongs to the technical field of computer networks and relates to a method for protecting web page text. It can be used for text protection on various information devices such as computers and mobile phones.
Background art
With the spread of information devices such as computers and mobile phones, online literature has become increasingly popular. To protect authors' copyright and interests, many online literature sites require users to pay before they can read the works on the site.
On these paid literature sites, however, once a single reader has paid and gained access, that reader can easily repost the content on other web pages, most often a personal blog, by viewing the page source or simply by copying and pasting the page text. Many people who want to read these works then no longer need to visit the paid site; they can read them by visiting those blogs instead. This is very convenient for readers, who can read the works for free, but it harms the authors' rights and significantly reduces the income of the paid sites.
At present, most major sites protect text by serving it as pictures: after the author submits a work online, the server generates pictures and sends the generated pictures to the client when a user browses the work. This technique protects the text fairly well, but pictures are much larger than text, so the transmission volume increases greatly, which is very inconvenient for users, especially mobile phone users. Security researchers have also proposed many other methods to protect such text. One example is to load a plug-in into the browser that scrambles the page order, so that a reader who inspects the page source sees only the shuffled text, which makes plain copy-and-paste harder. But this treats the symptoms rather than the cause. A reader with a little programming knowledge can easily work out the shuffling algorithm and then reverse the plug-in's function with the same approach, restoring the page text with little effort.
As information technology continues to develop, how to protect both the income of these sites and the rights of online authors is receiving more and more attention.
Summary of the invention
The object of the invention is to overcome the large transmission volume and poor security of existing web text protection methods by proposing a picture-based method for protecting web text. The method prevents users from obtaining and redistributing the content of the page text, effectively protects the copyright and interests of online authors, and reduces the losses of the site.
To achieve this object, the text protection method provided by the invention comprises:
(1) obtaining the text content at the web server and shuffling the order of the text;
(2) randomly selecting a small portion of the text and generating a picture from it;
(3) encrypting, and then compressing, the remaining text that was not rendered into the picture together with the coordinate information of each character;
(4) saving the compressed information and the generated picture in an HTML page and transmitting the page to the client;
(5) at the client, after receiving the HTML page, obtaining the picture and using it as the background picture, and decompressing and decrypting the received compressed information to recover the original information;
(6) at the client, rendering each character of the recovered information into pixels to generate a small picture containing only that character, superimposing the small picture on the background picture according to the character's coordinate information, and finally displaying to the user the complete page containing the full text.
The present invention has following advantage:
1) the present invention has reduced the quantity of information when transmitting because the medium and small partial content of picked at random text generates picture;
2) the present invention is because out of order with text, and text is remained content and coordinate ciphered compressed, makes in the transmission course difficulty that becomes of the recovery after the information acquisition, simultaneously, effectively prevented by checking that the page source code obtains text;
3) the present invention effectively prevents OCR identification owing to the contents such as font that add in interfere information and the standard character library in Background, further protects webpage text content;
4) the present invention effectively prevents to obtain the content of propagating page text by " copy-paste " because remaining each literal interpretation of text is become pixel.
Description of drawings
Fig. 1 is a diagram of the interaction between an existing browser client and server;
Fig. 2 is a flowchart of text processing at the server in the invention;
Fig. 3 is a flowchart of text processing at the client in the invention;
Fig. 4 is the picture generated by the server from a randomly selected small portion of the text characters in the experimental example;
Fig. 5 is the character picture generated by the client for the "driving" character in the experimental example;
Fig. 6 is the final text picture generated by the client in the experimental example.
Embodiment
The technical solution of the invention is further described below with reference to the drawings and a specific embodiment.
Fig. 1 shows the working principle of a web browser. The upper half shows the interaction between the client computer and the server; the lower half shows the interaction between the browser and the web server. The browser first sends a request to the web server; the web server responds to the request and sends the response data, usually an HTML file, to the client browser, which then displays the content of the HTML file on the user's screen. This is the most basic principle of web applications. In the invention, the processing module of the web server processes the text and generates the pixel information the client needs to compose the picture.
Referring to Fig. 2, the workflow of the invention at the server comprises the following steps:
Step 1: obtain the text content at the web server and shuffle the order of the text.
When a user requests a URL to browse the literary content of a web page, the browser sends the request to the server. On receiving the request, the server looks up the requested page and the literary content it contains and extracts that content. To raise the level of protection, the text is then shuffled. The shuffle uses the user account, login time, and IP address as the random seed to scramble the original order of the text.
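The seeded shuffle of step 1 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names, the SHA-256 seed derivation, and the use of a plain character index as the recorded position are all assumptions, and Python's `random.Random` is not a cryptographically secure generator.

```python
import hashlib
import random
from typing import List, Tuple


def session_seed(account: str, login_time: str, ip: str) -> int:
    """Derive an integer seed from the three per-session values named in the text.

    Hashing the joined values is an assumption; the source only says they
    are used as the random seed.
    """
    material = "|".join((account, login_time, ip)).encode("utf-8")
    return int.from_bytes(hashlib.sha256(material).digest()[:8], "big")


def shuffle_with_positions(text: str, seed: int) -> List[Tuple[str, int]]:
    """Pair every character with its original index, then shuffle the pairs."""
    items = [(ch, i) for i, ch in enumerate(text)]
    random.Random(seed).shuffle(items)
    return items


def restore(items: List[Tuple[str, int]]) -> str:
    """Rebuild the original text from the recorded indices."""
    return "".join(ch for ch, _ in sorted(items, key=lambda pair: pair[1]))
```

Because each position is recorded before the shuffle, the original order can be rebuilt from the recorded positions alone, without re-deriving the seed.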
Step 2: randomly select a small portion of the text and generate a picture.
From the shuffled text, a small portion of the content is selected at random, a font is chosen at random from the server-side font library, and the pixel information of the selected content is generated and composed into a picture.
To hinder OCR recognition, interference such as image noise is added while the pixels are generated. Specifically, background pixels, interference pixels, and noise lines are added as appropriate; the characters are given shadows and deformed; and glyphs from the standard font library and, where appropriate, from a custom font library are added.
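A rough sketch of the selection and noise steps, assuming the picture is modeled as a 0/1 bitmap held in nested lists. Real glyph rendering, font selection, shadows, and noise lines are omitted, and the names `split_for_picture` and `add_noise` are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def split_for_picture(items: Sequence, fraction: float,
                      rng: random.Random) -> Tuple[list, list]:
    """Randomly choose about `fraction` of the items for the server-side picture.

    Returns (items drawn into the picture, items left for encrypted transfer).
    """
    if not items:
        return [], []
    count = min(len(items), max(1, round(len(items) * fraction)))
    chosen = set(rng.sample(range(len(items)), count))
    in_picture = [item for k, item in enumerate(items) if k in chosen]
    remaining = [item for k, item in enumerate(items) if k not in chosen]
    return in_picture, remaining


def add_noise(bitmap: List[List[int]], density: float,
              rng: random.Random) -> List[List[int]]:
    """Return a copy of a 0/1 bitmap with roughly `density` of its pixels flipped."""
    return [[(px ^ 1) if rng.random() < density else px for px in row]
            for row in bitmap]
```

The fraction controls the trade-off the text describes: a smaller picture share means less image data to transmit, while the picture still withholds part of the text from the page source.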
Step 3: encrypt, and then compress, the remaining text that was not rendered into the picture together with the coordinate information of each character.
So that the client can determine the exact position of each character when restoring the original text, the coordinates of each character in the original text must be recorded before the shuffle.
First, the remaining text and the pixel coordinates of each of its characters in the generated picture are obtained; to keep transmission secure, this information is encrypted. The encrypted information is then compressed before transmission, which improves transfer efficiency and makes reading more convenient for the user.
Given how browsers work, efficiency requires most of the work to be done in the browser client; on the other hand, improving the security of network transmission requires the server to take certain measures. To balance transmission efficiency against transmission security, a small portion of the text is therefore transmitted as a picture and most of it is transmitted as text, which combines the two goals well.
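The encrypt-then-compress step might look like the sketch below. The source does not name a cipher or a compression format, so everything here is an assumption: the records are serialized as JSON, `zlib` stands in for the compressor, and the XOR keystream built from SHA-256 is only a toy placeholder for a real cipher and should not be used to protect actual data.

```python
import hashlib
import json
import zlib
from typing import List, Tuple

Record = Tuple[str, int, int]  # (character, x, y): a hypothetical record layout


def _keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key plus counter. Illustration only."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_cipher(data: bytes, key: bytes) -> bytes:
    """XOR with the keystream; applying it twice restores the input."""
    return bytes(a ^ b for a, b in zip(data, _keystream(key, len(data))))


def pack(records: List[Record], key: bytes) -> bytes:
    """Serialize, encrypt, then compress, in the order the text describes."""
    plain = json.dumps(records, ensure_ascii=False).encode("utf-8")
    return zlib.compress(xor_cipher(plain, key))


def unpack(blob: bytes, key: bytes) -> List[Record]:
    """Client-side inverse: decompress, decrypt, then parse."""
    plain = xor_cipher(zlib.decompress(blob), key)
    return [tuple(entry) for entry in json.loads(plain.decode("utf-8"))]
```

The sketch follows the order given in the text. Since ciphertext is close to random, a compressor generally gains little on it, so an implementation that cares about payload size may prefer to compress before encrypting.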
Step 4: save the compressed information and the generated picture in an HTML page and transmit it to the client.
The generated picture and the encrypted, compressed sensitive information are written into the HTML page, which is then sent to the client browser. With this processing, even if the HTML page is intercepted during network transmission, the interceptor obtains only a picture containing a small portion of the text, plus the shuffled text and coordinate information in encrypted and compressed form. Saving the compressed information and the generated picture in the HTML page before sending it to the client therefore protects the page text effectively.
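One possible way to carry both pieces in a single HTML page is sketched here. The source does not say how the picture and payload are embedded, so the base64 data URI, the `script` container, and the element ids `bg` and `payload` are assumptions made for illustration.

```python
import base64


def build_page(picture_png: bytes, payload: bytes) -> str:
    """Embed the background picture and the packed payload in one HTML page.

    Both values are base64-encoded so that arbitrary bytes survive as text.
    """
    img_b64 = base64.b64encode(picture_png).decode("ascii")
    data_b64 = base64.b64encode(payload).decode("ascii")
    return (
        "<!DOCTYPE html>\n<html><body>\n"
        f'<img id="bg" src="data:image/png;base64,{img_b64}">\n'
        f'<script id="payload" type="application/octet-stream">{data_b64}</script>\n'
        "</body></html>\n"
    )
```

Anyone who intercepts such a page sees only the partial picture and an opaque encoded blob, which matches the property described above.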
Referring to Fig. 3, the processing flow of the invention in the browser comprises the following steps:
Step A: receive the HTML page sent by the server and parse out the key information.
As soon as the client browser receives the HTML sent by the web server, it analyzes the page content and parses out the key sensitive information in the page: the generated picture, and the encrypted and compressed remaining text and coordinate information.
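The parsing in step A can be sketched with Python's standard `html.parser`. In the patent this work happens inside the browser; Python is used here only for illustration, and the page layout (an `img` with id `bg` holding a base64 data URI, and a `script` with id `payload` holding base64 data) is a hypothetical convention, not something the source specifies.

```python
import base64
from html.parser import HTMLParser
from typing import List, Optional, Tuple


class PageExtractor(HTMLParser):
    """Collect the background picture and the packed payload from a page."""

    def __init__(self) -> None:
        super().__init__()
        self.picture: Optional[bytes] = None
        self.payload: Optional[bytes] = None
        self._in_payload = False
        self._chunks: List[str] = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "img" and attr.get("id") == "bg":
            # Keep only the part after the "base64," marker of the data URI.
            self.picture = base64.b64decode(attr["src"].split("base64,", 1)[1])
        elif tag == "script" and attr.get("id") == "payload":
            self._in_payload = True

    def handle_data(self, data):
        if self._in_payload:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_payload:
            self.payload = base64.b64decode("".join(self._chunks).strip())
            self._in_payload = False


def extract(page: str) -> Tuple[Optional[bytes], Optional[bytes]]:
    """Return (picture bytes, payload bytes); either is None if not found."""
    parser = PageExtractor()
    parser.feed(page)
    parser.close()
    return parser.picture, parser.payload
```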
Step B: obtain the picture and use it as the background picture.
From the information obtained in step A, the picture is extracted. This is the picture generated in server step 2, which contains a small portion of the characters together with the interference information that hinders OCR recognition. It is used as the background picture of the page at the client.
Step C: obtain the remaining text information, then decompress and decrypt it.
The browser client first decompresses the text information obtained in step A and then decrypts it with the encryption key to recover the shuffled text information.
Step D: generate a picture of each character.
The browser client reads each character in the information in turn, renders it into pixel information, and generates a picture containing that character. The same OCR-hindering techniques are used while the picture is generated: various kinds of image noise are added to the character's pixel information, including background pixels, interference pixels, noise lines, character shadows, and deformation, together with glyphs from the standard font library and, where appropriate, a custom font library.
Step E: according to the position of each character in the picture, read the coordinates of each small character picture in the background picture and superimpose each small picture on the corresponding position of the background picture. The result is a picture web page containing the complete text information, which is displayed to the user.
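The superposition in step E amounts to pasting each small character picture onto the background at its recorded coordinates. The sketch below models pictures as nested lists of pixel values and treats each coordinate as the upper-left corner of the character's region; the bitmap model and the function names are assumptions for illustration, and glyph rendering itself is not shown.

```python
from typing import Iterable, List, Tuple

Bitmap = List[List[int]]


def paste(background: Bitmap, tile: Bitmap, left: int, top: int) -> None:
    """Overwrite the background region whose upper-left corner is (left, top).

    Pixels that would fall outside the background are skipped.
    """
    for dy, row in enumerate(tile):
        for dx, px in enumerate(row):
            y, x = top + dy, left + dx
            if 0 <= y < len(background) and 0 <= x < len(background[y]):
                background[y][x] = px


def compose(background: Bitmap,
            tiles: Iterable[Tuple[Bitmap, int, int]]) -> Bitmap:
    """Paste every (tile, left, top) triple onto a copy of the background."""
    canvas = [row[:] for row in background]
    for tile, left, top in tiles:
        paste(canvas, tile, left, top)
    return canvas
```

Working on a copy keeps the server-supplied background intact, so the same background can be recomposed if the character tiles are regenerated with different noise.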
The effect of the invention is further illustrated by the following experimental example:
1. The original text selected for this experiment is "in the little a compound occupied by many households of interior emperor's diet room; because some sees that the people of dinner party does not also return; some is but slept down early; thereby whole courtyard is all dark; quiet to the utmost point quiet; as to touch the black room of wanting to give for change oneself, but again and again attacked at heart by a kind of fear, the broken snow that falls once in a while also can be shied me, two bunches of glittering lights sparkle, be unique interdependent seemingly in such cold night, for each other radiance exist, if there is one to go out, in another small cup ... the capital is permanent lonely in the dark ... transmit deserted and lonely sounding the night watches far away, a unexpected sound, at heart also as being frightened, indistinct to trembling ... I do not know what oneself is fearing, but have a hunch a kind of atmosphere of constraining gradually to the limit, begin to split and be dispersed in this Forbidden City, some thing really is premonitory ... be that I have thought suddenly like this after how long having crossed? " The coordinates of the characters are then extracted, and finally the original order of the text is shuffled using the user account, login time, and IP address as the random seed.
2. At the server, a small portion of the text is selected at random: "the interior food for the emperor is little, and put in order the courtyard black because of dinner be that only this existence is gone to planting the also bright sample of idol snow; the distant sound prominent acoustic shock of sounding the night watches in being; war gradually begin very pre-in the city ... ", 46 characters and punctuation marks in total. A picture is generated from them, with interference information that hinders OCR recognition added and with fonts from the standard font library and a custom font library, as shown in Fig. 4.
3. The shuffled characters and coordinate information are encrypted and compressed; the picture information and the compressed information are then saved in an HTML page and returned to the client.
4. The client receives the information sent by the server, obtains the picture information and uses it as the background picture, then decompresses and decrypts the information to obtain the shuffled characters and coordinate information.
5. Each character in the information is read in turn, and a small character picture with interference information is generated for each one. The picture for the "driving" character, for example, is shown in Fig. 5, with the interference information added.
6. The coordinates of each character in the background picture are read, and the picture of each character is superimposed on the background picture according to the coordinate information. In the implementation, the upper-left corner of the region that each character occupies in the background picture is used as the position of that character in the picture. After all character pictures have been superimposed on the background picture, the picture containing the whole text can be seen, as shown in Fig. 6. Fig. 6 shows that the experiment rendered the characters as pictures with interference pixels added and with fonts from the standard and custom font libraries, which effectively hinders OCR recognition, prevents the information from being misappropriated, and protects the web page text.
This experiment shows that the picture-based method for protecting web texts proposed by the invention not only improves transmission efficiency but also secures transmission to a certain extent, thereby effectively protecting the rights and interests of online authors and paid sites.

Claims (4)

1. A method for protecting web texts based on pictures, comprising the following steps:
(1) obtaining the text content at the web server and shuffling the order of the text;
(2) randomly selecting a small portion of the text and generating a picture from it;
(3) encrypting, and then compressing, the remaining text that was not rendered into the picture together with the coordinate information of each character;
(4) saving the compressed information and the generated picture in an HTML page and transmitting the page to the client;
(5) at the client, after receiving the HTML page, obtaining the picture and using it as the background picture, and decompressing and decrypting the received compressed information to recover the original information;
(6) at the client, rendering each character of the recovered information into pixels to generate a small picture containing that character, superimposing the small picture on the background picture according to the character's coordinate information, and finally displaying to the user the complete page containing the full text.
2. The method for protecting web texts based on pictures according to claim 1, wherein the shuffling of the text order in step (1) uses the user account, login time, and accessing IP address as the random number seed to rearrange the text out of order, making it difficult to recover the original content of the text.
3. The method for protecting web texts based on pictures according to claim 1, wherein, in randomly selecting a small portion of the text and generating a picture in step (2), noise of various kinds that hinders OCR recognition is randomly added to the text picture, including background pixels, interference pixels, noise lines, and character shadows, and fonts from both the standard font library and a custom font library are used.
4. The method for protecting web texts based on pictures according to claim 1, wherein, when the client of step (5) renders each character of the recovered information into pixels and generates a small picture containing that character, noise of various kinds that hinders OCR recognition is randomly added to the small picture of each character, including background pixels, interference pixels, noise lines, and character shadows.
CN200910023185A 2009-07-02 2009-07-02 Method for protecting web texts based on pictures Pending CN101814118A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN200910023185A CN101814118A (en) 2009-07-02 2009-07-02 Method for protecting web texts based on pictures

Publications (1)

Publication Number Publication Date
CN101814118A true CN101814118A (en) 2010-08-25

Family

ID=42621370

Country Status (1)

Country Link
CN (1) CN101814118A (en)



Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication

Open date: 20100825