European Experts Exchange, the very best site for high-quality IT solutions


Languages :: Java :: how to copy a full java site?


By: hamood Canada  Date: 15/07/2007 03:06:30  English  Points: 20 Status: Answered
Quality : Excellent
I'd like to know how I can copy a full Java website, including all its pages/images/audio/video.
I tried several programs, such as Website Downloader, SurfOffline and Grab-a-Site 5.0.
The only problem with these programs is that they only download the main URLs of the site; they cannot access the "next page".
To be more precise, here is an example:
the website has a domainname.com/home.asp. On the homepage there are 100 pages that you can access by pressing Next or by choosing the page number at the bottom of the homepage. What these programs can't do is actually access these 100 pages; they only reach the plain URLs.
To go to the second page, the link given is javascript:viewPage(2)

Any help on which program I can use is more than welcome.
Thank you, and if you have further questions please don't hesitate to ask.
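The `javascript:viewPage(2)` link above is the likely reason the downloaders stall: such tools typically follow ordinary `href` URLs and do not execute JavaScript. As a rough sketch only, the page numbers can at least be pulled out of the HTML with a regular expression. The `sample` markup is invented for illustration, and what `viewPage` actually requests from the server is unknown here and would have to be checked in the site's own script.

```python
import re
from typing import List

# Matches hrefs such as javascript:viewPage(2); the function name comes from the question.
VIEW_PAGE = re.compile(r"javascript:\s*viewPage\(\s*(\d+)\s*\)", re.IGNORECASE)


def find_page_numbers(html: str) -> List[int]:
    """Return the sorted, de-duplicated page numbers referenced by viewPage(...) links."""
    return sorted({int(n) for n in VIEW_PAGE.findall(html)})


# Hypothetical pager markup, not copied from the real site.
sample = """
<a href="javascript:viewPage(2)">2</a>
<a href="javascript:viewPage(3)">3</a>
<a href="javascript:viewPage(2)">Next</a>
"""

print(find_page_numbers(sample))
```

Once the request behind `viewPage(n)` is known (a query parameter or a form post, for instance), the list of page numbers gives the set of pages to fetch one by one.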
By: VGR Date: 15/07/2007 09:53:21 English  Type : Comment
well, it's probably designed that way specifically to hamper you ;-)

people usually don't want their sites to be copied that way.

using Java (or Flash) is a good start; using intelligent and obscure JavaScript on links is another good step.


could I have the site's URL, to have a look at what could be done ?
By: hamood Date: 16/07/2007 18:57:18 English  Type : Comment
Well, the site owners gave me permission to copy the products from the website, but they've got thousands of items, and it's unrealistic to copy them one at a time.
If you know of a way or a program I can use to copy the products section that is reasonably reliable, I'll be very grateful. As for their design, I have no interest in copying it.
the website is: http://www.oceanwholesale.com/main.asp

thank you


By: VGR Date: 17/07/2007 06:49:48 English  Type : Comment
if you have authorization, the simplest way is for you to obtain from them a (temporary?) access via FTP or SFTP, ***or*** for them to give you a ZIP/GZ/TAR/whatever archive containing the files ...

also, given it's the products you want, I'm sure they can give you the tables from the database along with the images stored for the products.

This said,
1) I can't login ;-)
2) it's a flash website, not a java one ;-)

tell me if I can help further
By: hamood Date: 18/07/2007 04:25:21 English  Type : Comment
As for them giving me access to their FTP, they didn't want to do that because I am only a seller of their items. I don't know why they're making it hard, but they said I can take their pictures and list their items, and that's what I want to do: have a catalog to show my customers. But it's more than 5000 items, and I'm sure there's a way to download their items in a catalog style. Even if I had to copy it one page at a time, at least that's not as bad as one item at a time. As for the main page, it is Flash, but when you log in there is Java in the items.

I created a new account for you to check out the website, if you don't mind. It's:

user name: euroexpert
pass: asdfasdf

What's the easiest way to copy the items? And are there any programs to help?

thank you
By: VGR Date: 19/07/2007 07:25:55 English  Type : Comment
i will check later today.
By: hamood Date: 23/07/2007 00:16:45 English  Type : Comment
Hey there,
I was wondering if you had a chance to take a look at the website? I would really appreciate it if you did.

user name : euroexpert
password : asdfasdf

thank you
By: VGR Date: 23/07/2007 07:28:02 English  Type : Comment
:D

I call upon your indulgence: I've three children to care for ;-)

No, I haven't taken the time to check yet. But I will. Promised. ASAP.
By: VGR Date: 26/07/2007 14:16:07 English  Type : Answer
ok, your problem has nothing to do with Java or Flash. It's purely a "site aspirator" problem. You want to siphon (correct?) the web site.

I recommend using some program to first get the main page's references (the left menu, as in Garments: http://www.oceanwholesale.com/sortslist.asp?sortsid=55).
Then fetch those URIs in sequence and retrieve the associated images (for instance, the link http://www.oceanwholesale.com/product_detail.asp?Id=24827 is associated with the image http://www.oceanwholesale.com/product_images/uploadpic/200772618144846618.jpg).
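As an illustration of that two-step idea, here is a minimal sketch using only Python's standard library. It parses a page that has already been downloaded and collects the product-detail links and product image URLs. The URL patterns come from the examples above; the class and function names and the `sample` markup are hypothetical, the real pages may be laid out differently, and the fetching itself (with login and polite delays) is deliberately left out.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

BASE = "http://www.oceanwholesale.com/"  # site named in the thread


class ProductLinkParser(HTMLParser):
    """Collect product-detail links and product image URLs from one HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.product_pages = []
        self.images = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a":
            href = attrs.get("href") or ""
            if "product_detail.asp" in href.lower():
                self._add(self.product_pages, href)
        elif tag == "img":
            src = attrs.get("src") or ""
            if "product_images/uploadpic/" in src.lower():
                self._add(self.images, src)

    def _add(self, bucket, url):
        # Resolve relative links against the base URL and skip duplicates.
        absolute = urljoin(self.base_url, url)
        if absolute not in bucket:
            bucket.append(absolute)


def extract(html, base_url=BASE):
    """Return (product_pages, images) found in the given HTML text."""
    parser = ProductLinkParser(base_url)
    parser.feed(html)
    return parser.product_pages, parser.images


# Hypothetical markup for illustration; not copied from the real site.
sample = """
<a href="product_detail.asp?Id=24827">H4173</a>
<img src="/product_images/uploadpic/200772618144846618.jpg">
<a href="product_detail.asp?Id=24827">H4173 again</a>
"""

pages, images = extract(sample)
print(pages)
print(images)
```

Run over each category page, `extract` gives the product pages to visit; run over each product page, it gives the image to download next to the item reference.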

I say this because I found no logic in the image numbering; otherwise I would have recommended extracting the images directly from the images directory, i.e. http://www.oceanwholesale.com/product_images/uploadpic/. Too bad, really, that the images are not named after the item reference (H4173 in this case). We would just have had to get all the references (fewer than one hundred HTTP calls) and then try to fetch the associated images directly, stopping when the sequence could not be continued.

This would have been faster.

I have written a lot of polling/data extraction robots like this. The problem is that they're very sensitive to changes in the layout of the target site.

If you want to copy all the references once, then that's fine (throwaway code, IMHO); if you want to keep your data updated/synchronized with the target site's, then you'll need to spend some hours here and there fixing broken stuff. Not too difficult either.

I could even do this for you for a fair compensation (like some items from my wishlist on Amazon ;-)
By: VGR Date: 04/10/2007 19:20:40 English  Type : Comment
any news or feedback on this problem ? Should it be closed ? Please do ;-)
By: OpConsole Date: 01/11/2007 16:12:13 English  Type : Comment
Dear,

If you found some of the above comments helpful in solving your issue, you should Accept the Answer or Split points between the various useful comments. Each one can receive a quality evaluation from + (somewhat helpful) to +++ (working solution).

Given that this Question has been Open for quite a while now, please "accept an Answer" accordingly, ASAP.

This Question will be randomly force-closed one month from now.

Thanks and regards.

Admin.
