Facebook cannot scrape my site URL

Active Subscriptions:

None
8 years 2 months ago #38695 by 財榮吳
Hello there,

My site is multilingual, and JFBConnect doesn't handle the Open Graph tags properly when submitting posts to Facebook. I also tried to debug my site URL, and "Scraped URL" showed nothing.

Here are my site URLs:

www.herbalnc.com (locale language)
www.herbalnc.com/tw (alternative language)
www.herbalnc.com/myzh (alternative language)
www.herbalnc.com/vn (alternative language)

None of the above URLs could be scraped properly by Facebook.

Hope you can help.
Thanks a lot.
The topic has been locked.
Support Specialist
8 years 2 months ago #38722 by alzander
That issue doesn't look to be a JFBConnect, or even a Joomla issue. Facebook simply can't get data from your site. Although a 200 response code is being returned (which is correct), Facebook isn't getting any data, and that's why they're throwing that message.

I've been looking, but couldn't find any reason that Facebook can't fetch your page contents. The best way to tell if this is a JFBConnect issue is to disable the JFBCSystem plugin and test again. That disables almost all of JFBConnect, including the Open Graph tags. It sounds like you already know that it isn't JFBConnect, as you mention other extensions have had this problem.

If Facebook still can't scan your page with that disabled, you'd need to investigate your access logs and see what is being returned when Facebook hits your site. Facebook has a User Agent string of "facebookexternalhit/1.0" or "facebookexternalhit/1.1", which should make finding the request easier.

To investigate the logs, you would likely need to contact your host. You should be able to check the response logs for the response code (it should be 200) and the length of the page which was returned. The length will be critical because Facebook is reporting that "No data could be retrieved from URL". My guess is that, on these requests, your server is returning an empty page, which is why Facebook is complaining.
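
If you can get a copy of the access log yourself, a small script can pull out just the Facebook requests with their response code and size. This is only a rough sketch: it assumes the log is in the common Apache/Nginx "combined" format, and `facebook_hits` is a helper name made up for this example.

```python
import re

# Assumed layout (combined log format):
# ip ident user [time] "METHOD path PROTOCOL" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)

def facebook_hits(lines):
    """Return (path, status, size) for each request made by Facebook's crawler."""
    hits = []
    for line in lines:
        m = LOG_LINE.match(line)
        if m and "facebookexternalhit" in m.group("agent"):
            # A "-" in the size column means no body bytes were logged.
            size = 0 if m.group("size") == "-" else int(m.group("size"))
            hits.append((m.group("path"), int(m.group("status")), size))
    return hits
```

You would call it with the lines of the log, for example `facebook_hits(open("access.log"))`, where the file name is whatever your host gives you. A status of 200 with a size of 0 (or a size far below what a browser gets) would point at the server returning an empty or different page.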

Some servers have odd configurations that block or change the output when Facebook is detected. That's not the right behavior, but it's the best guess I have for what's happening on your site.
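
One way to check for this from the outside is to request the same page twice, once with Facebook's User-Agent and once with an ordinary browser string, and compare what comes back. The sketch below uses only Python's standard library; `build_request` and `body_length` are names invented for the example, and the browser User-Agent is just a placeholder. It only imitates the User-Agent, so it won't catch blocking based on Facebook's IP addresses.

```python
import urllib.request

FACEBOOK_UA = "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)"
BROWSER_UA = "Mozilla/5.0"  # placeholder standing in for an ordinary browser

def build_request(url, user_agent):
    """Build a GET request that identifies itself with the given User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})

def body_length(url, user_agent, timeout=10):
    """Fetch the URL and return (status code, number of body bytes received)."""
    with urllib.request.urlopen(build_request(url, user_agent), timeout=timeout) as resp:
        return resp.status, len(resp.read())
```

Calling `body_length(url, FACEBOOK_UA)` and `body_length(url, BROWSER_UA)` for the same page and getting very different lengths would suggest the server treats the crawler differently.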

I hope that helps get you started. I've tested with multiple 3rd party tools to fetch your page, and they all come back with the correct body response. So, we'll need to narrow down whether it's your server returning something incorrect. If not, then it means Facebook isn't accepting what you're returning, which could mean your site or IP address has been blacklisted for some reason. That's more difficult to fix, as you'd need to contact Facebook.

I hope that helps,
Alex
8 years 2 months ago #38728 by 財榮吳
Hello,

I've checked the access log and found only the following entries:

69.171.237.14 - - [13/Nov/2013:11:23:30 -0600] "GET /tw HTTP/1.1" 200 6080 "-" "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)"
69.171.237.13 - - [13/Nov/2013:11:23:48 -0600] "GET /tw HTTP/1.1" 200 6080 "-" "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)"


Any idea?

BTW, I've already asked my host admin for some tips.

Thanks a lot.
Tyrone
Support Specialist
8 years 2 months ago #38736 by alzander
Tyrone,
I just did a quick test and pulled your HTML down to a different server and used the Facebook Debug tool. It *seemed* to work, but there were other issues, like the domain not being allowed for your App ID. So, it's not a conclusive test, but it does suggest (at least) that it's likely not the HTML on your page that's causing a problem (parsing error, etc).

With the log lines above in hand, can you check what a request to the /tw page normally returns? Specifically, the 6080 number in each line is the response size: 6080 bytes are being returned. On my server, the same HTML returns 20012 bytes. 6080 could be valid due to character encoding or GZip compression, or it could be a sign that something different is being returned to Facebook. Let me know if other requests are 6080, or a different size, as that will really narrow things down further.
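
To get a feel for whether compression alone could explain the difference, you can gzip a saved copy of the page and compare the two sizes. This is a minimal sketch using Python's standard `gzip` module; `sizes` is a made-up helper name, and the compression level your server uses is an assumption, so expect a ballpark figure rather than an exact match.

```python
import gzip

def sizes(html_bytes, level=6):
    """Return (raw_size, gzipped_size) in bytes for a page body."""
    compressed = gzip.compress(html_bytes, compresslevel=level)
    return len(html_bytes), len(compressed)
```

For example, `sizes(open("tw.html", "rb").read())` on a saved copy of the page (the file name is hypothetical). If the gzipped size lands near the number in the log, the log entry is probably just the compressed page.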

Thanks for your patience. We'll keep plugging away at what could be wrong.

Thanks,
Alex
8 years 2 months ago #38744 by 財榮吳
I've created a new Facebook application for my site, but after I submitted it to JFBConnect with AutoTune, I got an error under "Facebook App":
Facebook API Error: (#100) canvas_url URL is not properly formatted

What can I do?

Thanks.
8 years 2 months ago #38746 by 財榮吳
Hello Alex,

Something has changed! After I disabled Gzip in Joomla, Facebook seems to scrape something from my site, with the following error:
Object at URL 'www.herbalnc.com/tw' is invalid because the configured 'og:type' of 'herbalnc_global:' is invalid.

And when I click "Scrape URL", something is fetched! Previously the Scrape URL view was just blank.

Maybe if I solve this error, Open Graph will work.

I hope you can give me some tips for this error.

Thanks a lot.
8 years 2 months ago #38747 by 財榮吳
Hello Alex,

The Object problem was caused by my custom setting in JFBConnect. After I unpublished the object setting, the problem was gone.

So in the end, the problem seems to have been caused by Gzip! After I disabled Gzip, everything seems to work fine!

I'll test further and let you know if I find any problems. Thank you very much for the "Gzip" tip. ^^
8 years 2 months ago #38748 by 財榮吳
Only a little problem remains. If the URL is one of the site's locale language URLs, such as:
www.herbalnc.com
www.herbalnc.com/tw

There is still an issue:
og:url tag in the header is not the same URL as rel='canonical' link in the html.


But if I go deeper into a subdirectory, everything is fine, for example:
www.herbalnc.com/tw/wlc
www.herbalnc.com/myzh/fasobook

What's wrong with my settings? Thanks a lot.
Support Specialist
8 years 2 months ago #38749 by alzander
Fantastic! Glad we're making some progress here!

We'll have to look into the og:url and canonical issue that you're having. Can you let me know what you're using to set up the multi-lingual features on your site? Is it a 3rd party extension, or Joomla's built-in multi-lingual support?
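
In the meantime, it can help to see exactly which two URLs Facebook is comparing on the affected pages. The sketch below uses Python's built-in `html.parser` to pull the `og:url` meta tag and the `rel="canonical"` link out of a page's HTML; `HeadLinks` and `compare_urls` are names invented for this example, and it assumes each tag appears at most once (the last one wins otherwise).

```python
from html.parser import HTMLParser

class HeadLinks(HTMLParser):
    """Collect the og:url meta content and the canonical link href."""

    def __init__(self):
        super().__init__()
        self.og_url = None
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "meta" and a.get("property") == "og:url":
            self.og_url = a.get("content")
        elif tag == "link" and (a.get("rel") or "").lower() == "canonical":
            self.canonical = a.get("href")

def compare_urls(html):
    """Return (og_url, canonical, whether the two are identical strings)."""
    parser = HeadLinks()
    parser.feed(html)
    return parser.og_url, parser.canonical, parser.og_url == parser.canonical
```

Running `compare_urls` on the HTML source of a language home page and of a deeper page should show where the two values start to differ, for instance in a trailing slash or a language segment.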

As for the Autotune error, there's an issue with Facebook that we're looking into fixing. We should have an update out soon, and sorry for the issue.

Thanks,
Alex
8 years 2 months ago #38754 by 財榮吳
Thank you very much for your great support!
I'm using the built-in multilingual feature of J3.2.