XML and MS-DOS copy command

A bad combination. Why use it? Because you need that XML. What can happen? Ugly things. Here is what happened to me: after running this command, a new character (a SUB) appeared in my XML file, visible only in special editors:

[Image: copy command XML files error]

The result is an invalid XML file. Until now everything was fine and this batch file was working correctly, so it is possible that my computer has picked up a virus or that something has gone wrong with my Windows. I didn't find the answer!

The solution is ... to not use copy; there is the xcopy MS-DOS command, or robocopy [I haven't used it yet].
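
If you are stuck with files that already contain the extra character, a small script can strip it. What follows is only a sketch in Python, not part of my batch file, and the function names are made up for the example. It assumes the SUB (byte 0x1A, Ctrl+Z) sits at the very end of the file. That is where copy puts its end-of-file marker when it runs in ASCII mode (/A, the default when files are combined with +), although I can't say for sure that this is what happened in my case.

```python
import tempfile
from pathlib import Path

SUB = b"\x1a"  # ASCII 26, the SUB / Ctrl+Z character


def strip_trailing_sub(data: bytes) -> bytes:
    """Return data without any SUB bytes at the very end."""
    return data.rstrip(SUB)


def clean_file(path: Path) -> bool:
    """Remove trailing SUB bytes from a file; return True if it was changed."""
    original = path.read_bytes()
    cleaned = strip_trailing_sub(original)
    if cleaned != original:
        path.write_bytes(cleaned)
        return True
    return False


if __name__ == "__main__":
    # Demo on a throwaway file (hypothetical name) so nothing real is touched.
    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "sample.xml"
        sample.write_bytes(b"<root/>\r\n" + SUB)
        print(clean_file(sample))   # True: the trailing SUB was removed
        print(sample.read_bytes())  # b'<root/>\r\n'
```

The file is read and written as bytes on purpose, so nothing else in the XML gets changed. A SUB in the middle of the file is left alone, because that would be a different problem.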


Posted by: admin
Posted on: 2/6/2010 at 3:48 AM
Categories: Code | Programing | security | Windows

Building a simple crawler - indexing the internet starting from one page

Is this possible? To index the internet starting from one page? Yes, I have tested this, starting from iuliumaniu.ro, a site that I built for a Christian community a long time ago (when I wrote such a program for my college diploma). It consists of a few small steps which, when followed, can collect the content of every page reachable through links, or just what you need from it.

Steps for crawling the internet:

1. set u = starting url

2. load u

3. [?store data about page u]

4. process page u - extract links from u content

5. foreach u = extracted link go to step 2.
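
The five steps can be sketched in Python roughly like this. It is a minimal sketch, not the program I wrote for my diploma: the function names, the `max_pages` limit and the `seen` set (so the same page is not loaded twice) are my additions for the example, links are extracted with a simple regular expression, and "storing" a page just means keeping its text in a dictionary.

```python
import re
from collections import deque
from urllib.parse import urljoin
from urllib.request import urlopen

# Very simple pattern for href="..." / href='...'; it will miss unusual markup.
HREF_RE = re.compile(r'href\s*=\s*["\']([^"\']+)["\']', re.IGNORECASE)


def fetch(url):
    """Step 2: load the page with an HTTP request and return its text."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def extract_links(base_url, html):
    """Step 4: pull href values out of the text and make them absolute."""
    return [urljoin(base_url, href) for href in HREF_RE.findall(html)]


def crawl(start_url, fetch_page=fetch, max_pages=50):
    """Walk outward from start_url; return {url: page text} for visited pages."""
    queue = deque([start_url])            # step 1: u = starting url
    seen = {start_url}                    # assumption: never load a url twice
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        try:
            html = fetch_page(url)        # step 2: load u
        except (OSError, ValueError):     # network error or unsupported url
            continue
        pages[url] = html                 # step 3: store data about page u
        for link in extract_links(url, html):  # step 4: extract links
            if link.startswith(("http://", "https://")) and link not in seen:
                seen.add(link)
                queue.append(link)        # step 5: each link becomes a new u
    return pages
```

Calling `crawl("http://iuliumaniu.ro/", max_pages=10)` would load at most ten pages. The `fetch_page` parameter is there so the loop can be tried with a fake loader instead of real HTTP requests.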

The 1st step is simple: you have to select a page/site that contains some external links, so you can walk on to other sites too. The 2nd step means getting that page's content, usually with an HTTP web request. The 3rd step can be placed before or after step 4, depending on what you need to collect (if you also need to collect the links, or you have to process the stored data, this step will probably come after step 4); it consists of some data storage implementation (database, XML, ...). Processing the page content can be done in several ways. I can give you 2 simple ones: parse it as XML/HTML, or process it as text, possibly using regular expressions. XML is harder to implement, but it can give you some advantages. And in the end you follow all the page URLs and jump back to step 2; this ensures that your application eventually indexes every page it can reach through links.

This is a little theory about crawlers; it is not very difficult to implement. Come back soon for a small implementation of this.
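
To show what the XML/HTML way of processing a page can look like, here is a small sketch that uses the HTML parser from the Python standard library instead of a regular expression. The class and function names are only for illustration. One advantage of parsing is that a link inside an HTML comment is not picked up, while a plain text search would find it.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, resolved against base_url."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # The parser lowercases tag and attribute names for us.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                self.links.append(urljoin(self.base_url, value))


def extract_links(base_url, html):
    """Return absolute urls of all <a href> links in the html text."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


if __name__ == "__main__":
    html = ('<p><a href="/about">about</a> '
            '<!-- <a href="/old">old</a> --> '
            '<a href="http://example.org/">other site</a></p>')
    print(extract_links("http://example.com/", html))
    # ['http://example.com/about', 'http://example.org/']
```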


Posted by: admin
Posted on: 1/10/2010 at 5:06 PM
Categories: Articles | Crawlers | Google | Programing | SEO | WWW

How to insert into a table, in the column specified as identity

Do you need something like this? I need it many times, especially when I have to synchronize two databases, or a live database with a demo database. So.. this is simple: you write a normal insert command plus two other lines, which switch off the identity restriction on that column for the duration of the insert:

SET IDENTITY_INSERT [table_name] ON

insert into [table_name] (id, name, ...) values (99,'popnadrian')

SET IDENTITY_INSERT [table_name] OFF

And you have done a "nice" job; your boss must be proud of you ;)


Posted by: admin
Posted on: 11/6/2009 at 3:40 PM
Categories: Programing | SQL