How do I use Ajax to get data from a website after the site has processed data?

  • Using AJAX I am trying to get some data from a site (a currency site).

    When the site initially loads it contains just the vanilla page source, but (I think) after a second or so the page gets processed and additional data is added (currency values etc.).

    The problem is that when I use AJAX, AJAX.LastData only contains the source code that was present on the initial request. So there are no values to extract, just a bunch of useless <span>s and <div>s.

    I want to extract data that I can see in the 'Inspect Element' window and not the 'View Source'.

    Is there any way to do an Ajax request that delays its return values until the site has done some processing?

    Or some other idea?

  • From your description I take it that you are trying to do some web scraping, and your problem is that the page you are trying to scrape has dynamically generated content which doesn't exist yet when the page is first requested. Is that right?

    As far as I am aware this is not possible directly from a browser using AJAX, either through C3's plugin or directly through JavaScript using an XMLHttpRequest.

    I don't think you will be able to do this directly from a C3 application.

    Any website that provides such a service first makes a request to a remote server. The server then uses what is known as a headless browser to navigate to the required website, scrape the data it needs, and finally return the result to the client.
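
To illustrate why the plain request comes back empty, here is a minimal Python sketch. The HTML string and the `rate` class name are made up for illustration; the real currency page will look different. It only shows that the initial source holds placeholders which the page's own script fills in later.

```python
from html.parser import HTMLParser

# Hypothetical initial page source: the value is written in later by JavaScript.
INITIAL_SOURCE = """
<html><body>
  <span class="rate"></span>
  <script>/* fetches the rate and writes it into the span */</script>
</body></html>
"""


class RateParser(HTMLParser):
    """Collects the text inside every <span class="rate"> element."""

    def __init__(self):
        super().__init__()
        self._in_rate = False
        self.rates = []

    def handle_starttag(self, tag, attrs):
        if tag == "span" and ("class", "rate") in attrs:
            self._in_rate = True
            self.rates.append("")

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_rate = False

    def handle_data(self, data):
        if self._in_rate:
            self.rates[-1] += data.strip()


def extract_rates(html):
    """Return the text of each rate placeholder found in `html`."""
    parser = RateParser()
    parser.feed(html)
    return parser.rates
```

Run against the initial source, `extract_rates` finds the placeholder but no value in it, which matches what 'View Source' shows. The same function would return the number once the markup has been rendered by a real browser.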

  • Just to clarify, headless browsers are libraries that let you write programs that behave like a normal internet browser. There should be at least one headless browser available for each of the most popular server-side languages, such as JavaScript (Node.js) or Python.

    They have functionality such as waiting for a page to execute JavaScript after the initial request, before trying to query the content of a page, among other things.

    For the most part they are used for website testing, but they can have other uses like web scraping.
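
As a rough sketch of the waiting behaviour described above, this is how it might look in Python with Playwright, which is one headless-browser library among several and not something the posters specified. It assumes Playwright and its Chromium build are installed on a server you control; the URL and selector you pass in are your own.

```python
def fetch_rendered_html(url, selector, timeout_ms=10_000):
    """Load `url` in headless Chromium and return the DOM once `selector` is visible."""
    # Imported here so the module still loads where Playwright is not installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()  # headless by default
        try:
            page = browser.new_page()
            page.goto(url)
            # Wait until the page's own JavaScript has produced a visible match.
            page.wait_for_selector(selector, timeout=timeout_ms)
            # Serialized DOM, comparable to what 'Inspect Element' shows.
            return page.content()
        finally:
            browser.close()
```

This has to run server-side; the C3 application would then make its AJAX request to that server rather than to the currency site itself.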

  • It sounds like yes, that's what I'm trying to do. If I can find out where they get the data from themselves I may be able to get it from the source instead.

  • An easy way to find out which requests a page is making is to open the dev tools in your browser, usually by pressing F12, and then go to the Network tab.

    You will see everything the page is doing, which usually is a lot more than you would think! It might take a while to go through everything but you will surely find it there.

    If you are lucky you will be able to hit the endpoint yourself, unless it is protected in some way, for example by some sort of pass key or by a cross-origin access policy.
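
If the Network tab does reveal a JSON endpoint, calling it directly could look something like the Python sketch below. The bearer-token header is an assumption about how a pass key might be sent; the actual site may use a different scheme, and any URL or token you pass in is your own.

```python
import json
import urllib.request


def build_request(url, token=None):
    """Prepare a GET request for a JSON endpoint, optionally with a token."""
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    if token is not None:
        # Assumption: the endpoint expects a standard bearer token header.
        request.add_header("Authorization", f"Bearer {token}")
    return request


def fetch_json(url, token=None, timeout=10):
    """Send the request and decode the JSON body."""
    with urllib.request.urlopen(build_request(url, token), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A script like this runs outside the browser, so the cross-origin policy does not apply to it, but a pass key check on the server still does.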

  • I will dig through, thanks.

    I had an idea of loading the page into an iframe and then maybe accessing the data within the iframe after a few seconds. Is that even possible?

  • I found the location, but it seems it's protected by an authentication token. I guess generating that might be off limits. :(

  • I didn't think about the possibility of loading the page in an iframe, but you will have a similar problem. If the page you want to show doesn't like being shown inside a page from a different origin, then it won't work.
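
One way to check whether a page "doesn't like being shown" elsewhere is to look at its response headers. The Python sketch below is a simplified check, not a full policy parser: it only recognises `X-Frame-Options` and the `'none'` and `'self'` sources of the CSP `frame-ancestors` directive. The `framing_blocked` name is made up for this example.

```python
def framing_blocked(headers):
    """Return True if the response headers forbid cross-origin framing."""
    lowered = {name.lower(): value.lower() for name, value in headers.items()}

    # Legacy header: DENY and SAMEORIGIN both stop a foreign page embedding it.
    if lowered.get("x-frame-options", "").strip() in ("deny", "sameorigin"):
        return True

    # Modern equivalent: the frame-ancestors directive of Content-Security-Policy.
    csp = lowered.get("content-security-policy", "")
    for directive in csp.split(";"):
        parts = directive.split()
        if parts and parts[0] == "frame-ancestors":
            # Simplification: only 'none' and 'self' are treated as blocking.
            return any(src in ("'none'", "'self'") for src in parts[1:])
    return False
```

Even when framing is allowed, the browser's same-origin policy stops the embedding page's scripts from reading the contents of a cross-origin iframe, so the data inside it would still be out of reach.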
