Categories
Privacy Website whinging

The problem of stats

No, not statistics in itself. The problem I am writing about is website statistics, and it started a long time ago.

Back in the day we simply used web server logs to analyse website traffic. One could see an incoming IP address and see where the associated browser went in the website. This worked well back then as websites were simple affairs and essentially all one big lump. Of course, this was an era when web servers were run almost in the spare time of those few IT (and indeed non-IT) people who had any interest in the web. Back then I was not in the central IT team but I was afforded some latitude for experimenting with new things, especially when redundant hardware could be used. It was 1992 and the IMG tag was still in the realm of fantasy.

Later, there were two open source packages that became very popular, one called Analog and the other Linklint. The former produced statistics about website visitors and the latter could be used to check for errors, missing pages for example. Analog could, when provided with valid data, estimate which countries visitors were coming from, very useful when your organisation markets itself globally.

Of course, the marketeers desired more. I was once asked to find out where everyone who only looked at our home page went next. Ok, where they visited another of our own web servers this was do-able, but the question was expanded to ask which of our competitors they visited next. This was new thinking, by which I mean thinking that one could not associate with any other media. For example, if the publisher of one newspaper wanted to know which other newspaper a person took after only glancing at their own it would need some form of physical surveillance, or perhaps a questionnaire. Neither would be particularly reliable, the questionnaire in particular.

Enter, stage left, Google Analytics. I had attended a launch event – well of a sort anyway – where a new product was described which would enable one to search all across the web. The name? Google. We had rudimentary search products by this time but nothing like what was being described. Bells were ringing, but rather quietly. I think we could see back then that all of a sudden content had value, just not to us. But, Google search aside, we later got wind of Google Analytics and the bells got louder amongst those of us who could already see future issues.

Google Analytics arrived with two quite major advantages. First, IT people no longer had to do anything, and second, the marketeers would have access to easy-to-understand graphs. But those of us who had this nagging voice about global surveillance, and the fact that a corporate entity would effectively have access to data indicating where everyone browsed, were ignored. Fast forward to the later times of the GDPR and the coming-soon and already years-late PECR replacement, cookie laws and all that, and I resist shouting ‘we told you so’, but we did, and it was back in 1994.

Of course, there was still an issue. Ok, we have this useful global search facility now but how do we include local content which is not accessible from outside? Google again to the rescue. I had a pair of Google Search Appliances (GSA) installed, one in each of our main data centres and fronted by a NetScaler appliance. This provided resilience to the loss of a single GSA. Being on our LAN the GSAs were able to spider content that was restricted to local access and which therefore could not be spidered by Big Google. It also provided a useful facility whereby we could rank, to some extent, content and could apply keyword and key phrase matching to direct searches to specific content which would then appear top in the list of results. This little Google was far more friendly, not being bloated by the desire of the mothership to know all things of all people. Perhaps no surprise then that Google eventually retired the GSA product in favour of a cloud based provision. You guessed it, they wanted to know who was accessing all your secret stuff too.

Are we really where we are because marketing people wanted to know everything about everyone and companies, not just Google, cashed in on it? Yes, I think so, and you can see just how far by those invasive adverts that themselves continually leverage new technologies to further invade. Remember pop-ups? And then pop-up blockers? And of course the whole cookie debate where a really quite useful facility enabling shopping carts among other things was hijacked in order to track us across webspace. Yeah, those. Remember the good old doubleclick cookie, adware, ad blockers, layers upon layers of this stuff. It is almost all because of marketing.

Advertising is here to stay and I have absolutely no issue with it. Although I generally ignore it I will admit to having seen something advertised that I was unaware of and which actually filled a need. But there is a constant battle between the marketeers and the techies which will continue because all of this, the Internet, the web, email, is designed to help us and be easy to use and to access. And that’s where it all went wrong, but it could not really exist any other way.

Categories
Cookies and tracking Web content Website whinging

Trust

Trust in websites is under attack and has been for some time now. These days it is really very hard to know which websites to trust and which to avoid, which produce valid, trustable news stories and which are fake, even which product reviews are valid and which are misleadingly good and may even have been paid for. Fake websites include those that wish, among other things, to deprive you of your hard earned cash, or persuade you that voting ‘x’ is what you must do.

A recent win for Microsoft in a private trademark case highlights part of the issue and I have witnessed similar first hand. It transpired that scammers had passed themselves off as Microsoft or Microsoft partners and used various trademarks owned by Microsoft. This was all related to those well known ‘your computer has a virus’ type phone calls and pop-up adverts. I have worked on cases regarding academic integrity and websites passing off as our own and so this case is interesting to me. However, it serves to highlight just how easy it is to get someone to trust you by throwing up a website which looks identical to that of a company that you do trust, or at least know of.

To make matters worse, there are now so many domain variants available that it is very difficult to fully protect one’s brand. Again, I was very active here in the past and I could, for example, buy and activate domains similar to those used by people who created websites to pass off as our own. It was not helped one bit when Nominet decided to sell single-letter domains such as ‘a.uk’ where typo-squatting was then made easy, for example mistyping xyz.ac.uk as xyz.a.uk. Some years ago Ascension Island opened up its ‘.ac’ domain, again causing confusion where people would register xyz.ac hoping to trap typos from xyz.ac.uk. Just how far one goes buying any domains that come close to your own is a very difficult question and can result in large spends.
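To give a feel for how quickly the list of near-miss domains grows, here is a minimal Python sketch that enumerates two kinds of slip mentioned above: dropping a single character (xyz.ac.uk becoming xyz.a.uk) and dropping a trailing label (xyz.ac.uk becoming xyz.ac). The function names are made up for this illustration, and it ignores other common slips such as swapped or doubled letters, so treat it as a starting point rather than a complete brand-protection tool.

```python
from typing import List, Set


def deletion_variants(domain: str) -> Set[str]:
    """Domains formed by dropping exactly one character from one label."""
    labels: List[str] = domain.lower().split(".")
    variants: Set[str] = set()
    for i, label in enumerate(labels):
        for j in range(len(label)):
            shorter = label[:j] + label[j + 1:]
            if not shorter:
                continue  # a one-character label cannot lose its only character
            variants.add(".".join(labels[:i] + [shorter] + labels[i + 1:]))
    return variants


def truncation_variants(domain: str) -> Set[str]:
    """Domains formed by dropping trailing labels, keeping at least two."""
    labels = domain.lower().split(".")
    return {".".join(labels[:n]) for n in range(2, len(labels))}


if __name__ == "__main__":
    for candidate in sorted(deletion_variants("xyz.ac.uk") | truncation_variants("xyz.ac.uk")):
        print(candidate)
```

Even for a three-letter name this produces eight candidates, which hints at why buying every near miss soon becomes expensive.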

Encryption, aimed at promoting trust and security, does not really help. While it is laudable that one can obtain digital certificates for free, when coupled with domain squatting this can result in trust being placed where it really should not.

This is not limited to websites. Whoever thought it a good idea to allow people using IP telephony to put their actual phone number into the system on trust was just daft. You can no longer assume that a call comes from the number shown in the caller-ID, and if someone by chance or design fakes their number to be one already in your contacts list, well, you can see that going badly for the recipient.

So, where are we? Well, anyone can throw up a website, for free or very little cost. Anyone can grab the design of a valid website and repurpose it as their own scammer base. Anyone can buy just about any domain regardless of how close it is to a real company URL, set up email addresses and either wait for hits or advertise the fake website somehow. And this is without doing anything actually half clever like using malware. And it does not stop there. I worked on a case where a website had a valid-looking address in the City of London. Calls to the building management (an office block with lots of various companies) found no such name on record. In the event I was close to retirement and let this one slide, but I can just imagine some mailroom employee diverting any received post to the scammer. My longest running case took seven years but I finally had a foreign-based fake website closed down after radically disrupting their ‘business’.

To answer my ‘where are we?’ question in part, all I can say is it has become very hard to trust any information on the web, and that’s a crying shame. The scammers are like a virus – they are killing their host. How we can stop people becoming a victim I do not know. For myself, I begin by trusting nothing and I use my decades of experience to parse what I see and determine whether or not it is valid. Mobile phone calls from numbers not in my contacts are ignored. URLs in SMS messages or emails are NEVER clicked. If I can be bothered to I will investigate – obfuscated URLs, those where someone is attempting to be clever by mixing letters to look like something real, or adding to real-looking domains, can be easier to read if pasted into a text-editor. Anything that comes from the bank will also appear in their app and so can be checked.
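The same sort of inspection can be partly automated. The following is a rough Python sketch, using only the standard library, that pulls out the host a URL really points at, lists any non-ASCII look-alike letters in it, and checks whether the host genuinely sits under a domain you trust. The function name `inspect_url` and the `bank.example` domains are invented for the illustration; it is an aid to reading a URL, not a verdict on whether a site is safe.

```python
import unicodedata
from urllib.parse import urlsplit


def inspect_url(url: str, trusted_domain: str) -> dict:
    """Summarise the host a URL really points at."""
    # .hostname drops any "user@" prefix and port, and lower-cases the result.
    host = urlsplit(url).hostname or ""
    # Non-ASCII characters in a host deserve a second look: they may be
    # look-alike letters borrowed from another alphabet.
    non_ascii = [(ch, unicodedata.name(ch, "UNKNOWN")) for ch in host if ord(ch) > 127]
    # The host belongs to the trusted domain only if it IS that domain or
    # ends with "." + domain; merely containing the name is not enough.
    belongs = host == trusted_domain or host.endswith("." + trusted_domain)
    return {"host": host, "non_ascii": non_ascii, "belongs_to_trusted": belongs}


if __name__ == "__main__":
    samples = [
        "https://www.bank.example/login",                 # genuine subdomain
        "https://b\u0430nk.example/login",                # Cyrillic "a" in place of Latin "a"
        "https://bank.example.secure-login.example.net/", # trusted name used as a prefix
        "https://bank.example@login.example.net/",        # trusted name hidden in the user part
    ]
    for sample in samples:
        print(inspect_url(sample, "bank.example"))
```

Only the first sample is reported as belonging to `bank.example`; the other three show the mixed-letter and added-to-domain tricks described above.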

And don’t get me started on cookies!

Categories
Website whinging

The joys of recaptcha return

I’ve managed to avoid websites that use Google’s daft recaptcha thing until recently. But now PayPal wants it even though I log in and enter the SMS’d 2FA code. Despite clearing all cookies and probably-illegally stored crud it simply will not work on my Mac.

The daft thing is PayPal works fine on my iPhone – well, ok that’s the app version not the website. But I keep cookies disabled on the phone and run a cookie cruncher on the Mac. Even with that turned off recaptcha simply does not work on the Mac and it seems I’m the only one with an issue. Google hates me… but then, I rarely need to interact with it, given I use DuckDuckGo for searches and I have transferred almost all email that used to come to my Gmail account to a more reliable host.

Categories
Data protection Privacy

Proof of existence

In the march to get rid of paper records and have everything online it is becoming increasingly difficult to prove one’s details when signing up to, or dealing with, a process still based on old school mechanisms such as requiring bank statements and proof of address. This is compounded by the fact that, in becoming ever more online, the World is requiring people to own and know how to use a mobile phone while having little, if any, regard to the affordability of such an item. Cursory throw-away lines, such as pointing people without online access at home to their public library, are becoming increasingly moot with library closures and, not least, with Covid-19.

Examples of the complexities one may face are rife but here are two real-world examples, carefully crafted so as to not give any names away.

Person A works for organisation B and is changing roles within B. B needs two proofs of ID and two of address from A for the new role. However, A only has one proof of address, a bank statement. B states that a second bank account will do. A can open a bank account with another bank (C) online. C only needs a single proof of ID and a single proof of address, and A’s existing bank statement will suffice for the latter. Therefore, C has a lesser requirement for proof of ID and address than B and will provide a second proof of address to A to send to B. While one may argue that C has too low a burden of proof or that B has one too high, one cannot get round the fact that B already has all the information it needs as it is A’s employer.

Another example. A needs government department B to change some details about property C. B will not accept the evidence available to A but government department D does hold valid details about C. B tells A to purchase these from D. Why? Both B and D are government departments. In this case A simply dropped the issue given they had informed B of an error in the records held by B regardless of whether or not B would do anything about it.

In both the above cases the organisation in question (B in each case) has access, directly or otherwise, to the information that they require from A. In the first example this is via existing employment records, and in the second by simply requesting it from another department.

Now, in each case, if A had an official government-scheme ID card, as was proposed and shot to bits several years ago in the UK, B would not require any further information because all such information would be tied into the ID card provided to A. A hypothesis therefore exists that the establishment, governmental, quasi-governmental and commercial, are collectively making processes so hard for all the ‘A’s in the country that a future proposal for all citizens to be issued with ID cards will succeed by the mere fact that people are so fed up with having to find more exotic ways of proving their existence that they will not vote against it.

That cannot be right.

Categories
Cookies and tracking Privacy

Yup, more cookie observations

I have mentioned before that I have all cookies blocked on the phone. It’s a bit of a faff sometimes, I mean if I really need to access a site that requires a login or similar I need to re-enable cookies, do whatever I needed to do, then block cookies again, but it’s no big deal really.

And it is interesting to see what websites do not even need cookies to function, as well as which websites are so badly constructed that they do not even render anything with cookies blocked. Oh yes, and those websites that throw up a cookie banner but which still work once you are past that, of course with no actual cookies having been set.

As an example, I just visited a well known petition website to add my name. It showed the usual cookie warnings which I ignored and managed to sign the petition with no issues at all. I have an email confirmation so it worked just fine.

This brings me back to my question, should any website need to set any cookies before you enter a part that actually requires them to be set? I still say no.

Categories
Cookies and tracking Data protection Website whinging

Crumbling cookies

With the fines and threats imposed by France on Google and Facebook it was interesting to note that both Facebook and, possibly unrelated, eBay had logged me out overnight and I had a new-looking consent form presented by Facebook in the browser and eBay in the app. The Facebook app has not changed and I am still logged in.

So I had a look at Google again, specifically google.co.uk. The cookie-wall – I’m calling it that because you need to agree to get past it – looks the same as the last time. Google sets two cookies on entry, one (NID) which my cookie crunching app defines as a tracker, and another called CONSENT with a 2038 expiry date. After a short while it sets another called SNID. More success on the iPhone where I keep cookies blocked. Here, as before, the cookie-wall appears and then vanishes.

My take on this is to question why Google is setting these three cookies before I have consented to anything and, if they suggest that their product will not work without, then why does it work without? To my simple mind nothing should set any cookies until I agree, and even then the only cookie that should be set if I do not agree is one indicating this so it knows next time. Of course, strictly necessary cookies are excepted, but I would argue that no such cookie is needed until I explicitly request a service for which they are required. This would, or at least surely should, never happen on a website’s entry page, with the exception of sites that require a login before one can access, and even there surely there will be a not-logged-in page where no cookies are required until one logs in.
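The rule argued for here is simple enough to write down. Below is a minimal Python sketch of it using the standard library’s `http.cookies`: nothing is set before the visitor has answered, a refusal sets only a marker recording the refusal, and a strictly necessary cookie appears only at the moment the visitor asks for the service that needs it. The function name and the cookie names (`consent`, `analytics_id`, `session`) are hypothetical, chosen for the example; this is not a description of how Google or any other site actually behaves.

```python
from http.cookies import SimpleCookie
from typing import List, Optional


def cookies_to_set(consent: Optional[bool], logging_in: bool) -> List[str]:
    """Return Set-Cookie header values for one response.

    consent: None = visitor has not answered yet, False = declined, True = agreed.
    logging_in: True only when the visitor has explicitly requested a login.
    """
    jar = SimpleCookie()
    if consent is False:
        # The single cookie allowed after a refusal: remember the refusal.
        jar["consent"] = "declined"
    elif consent is True:
        jar["consent"] = "accepted"
        jar["analytics_id"] = "example-id"  # optional cookie, only after agreement
    if logging_in:
        # Strictly necessary, but only once the service is actually requested.
        jar["session"] = "opaque-token"
        jar["session"]["httponly"] = True
        jar["session"]["secure"] = True
    return [morsel.OutputString() for morsel in jar.values()]


if __name__ == "__main__":
    print(cookies_to_set(None, False))   # entry page, no answer yet
    print(cookies_to_set(False, False))  # visitor declined
    print(cookies_to_set(False, True))   # declined, but now logging in
```

The first call returns an empty list, which is the whole point: an entry page viewed before any choice has been made sets nothing at all.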

Categories
Privacy Security Website whinging

Failed circular verification

So, you need access to a Google doc but when you log in Google senses that the PC has not been used before and is suspicious. It needs verification.

Ok, first off, this is not me. I have access to Google etc. And verification is a great idea. But there is a hole and as yet we’ve not found the bottom.

Verification is all very well provided you can actually do what is required. But what about where your verification is your works telephone and you did not enter a mobile number, nor do you want to tell Google your mobile number anyway?

Google has ‘other ways’ to verify you. Following this path it sends a code to an email address. The only email address in use was the works one. The code came but this is not enough. Google still wants to send a text to a phone – it still wants that mobile number you don’t want to put in. This ends up being circular, with another code being emailed and, once again, another request for a mobile.

In the end it was quicker to ask the document owner to simply email it rather than trying to reach the bottom of the hole being dug by Google.

Categories
Cookies and tracking Website whinging

Google, sort-of positive

I know I whinge about Google from time to time but they do give me 15GB of storage, of which I use a tiny amount and only for Gmail (which is also free of course). Having just received an email about account charges for dormant accounts or those using too much space I thought I would check, and managed to free up an extra 20MB or so, meaning I am using about 300MB now for Gmail, much of which is me being too lazy to delete emails or pull attachments off onto local storage.

Yes, it does of course mean all those emails are sitting in Google somewhere and can be searched, but these days, be honest, if you really don’t want The World to see something don’t put it on the Internet in the first place. Speaking as a privacy advocate and, indeed, as a privacy researcher (Ph.D. in Internet privacy, 2017), you do need to take some responsibility for your own privacy. Encrypt important emails and let them scan all the remaining dross, ‘them’ here being all the nameless agencies around the globe rather than Google who, at the end of the day, need to make money somehow in order to give us 15GB of storage for free.

I’ve been in this game for a long time now and I remember Google when it was new. They made such a difference to web searches – anyone remember AltaVista? I ran Google Search Appliances for a number of years too which dramatically improved searches for our corporate websites.

But I will not stop whinging about the whole let’s track everyone across everywhere and see what they are looking for so we can tailor adverts to them… sorry.

Categories
Cookies and tracking

Another Google cookie change

I keep all cookies blocked on the phone unless I actually need to visit a site that uses them for a purpose that I decide is required, e.g. a login function. Even then, after I have finished and while still on whatever website it was I block cookies again and delete all web content (the iPhone option, YMMV).

Not long ago things changed at Google making it impossible to access unless cookies were enabled. I reported this at https://jmh.one/index.php/2020/10/30/google-learns/. Now this seems to have been reversed and once again I find I can visit Google and the cookie warning / acceptance box appears and then vanishes. For a while now I’ve been using DuckDuckGo for web searches but the Google cookie-wall-box did prevent me accessing YouTube for a while. So it’s rather handy that the cookie-wall-box has somehow changed back to performing its useful vanishing trick.

Of course, this may all be unrelated to Google, or perhaps I am hitting a different node now and there are configuration differences, who knows. But it’s a useful feature/bug nonetheless.

Categories
Website whinging

Website crashes

Once again we see a company spiralling into nonexistence along with the associated sales on their website and the associated flooring of said website by people wanting to buy.

Surely it’s time that website designers actually sorted things like this out? Time and again companies and governments throw up websites which are backend heavy to the extent that once over a certain threshold the backend cannot cope and sulks off into the 500 corner.

While I can accept, professionally speaking (or as was before I retired anyway!), a small website crashing under unexpected load, I cannot accept that a website that is actually designed to provide a user experience under load is put together in a way that it falls over. This is the 21st Century and this stuff is not rocket science. It’s not a DDoS attack, it’s actual people wanting to access a website! Governments are not excepted from this criticism – we’ve seen what our own lot manages to do recently. And surely the crashes caused by everyone and their dog jumping at online shopping sites due to COVID should be fresh in the memory of every web designer?

Now, ok, getting real for a tad – yes there is bound to be a limit of what cash can buy where web hosting is concerned and budgeting for ‘what if’ situations can be hard. But look at some of the cloud based services where you can simply let it run wild and pay for the extra horsepower second by second as needed. Given the unpredictability I can almost – I stress almost – accept a queuing system, but what really gets me is that someone implements a queuing system which itself overloads and errors! Good grief…
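For what it is worth, a queue that does not fall over is mostly a matter of putting a hard limit on everything, including the queue itself. The Python sketch below is a deliberately tiny, in-memory illustration under that assumption: a fixed number of visitors are let in, a fixed number wait, and anyone beyond that gets a cheap ‘busy, retry later’ answer (a 503 in HTTP terms) rather than an error. The class name, the limits and the status messages are invented for the example; a real site would need shared state, timeouts and much more.

```python
from collections import deque
from typing import Deque, Set, Tuple


class WaitingRoom:
    """Bounded admission: a capped number inside, a capped number queued."""

    def __init__(self, max_active: int, max_waiting: int) -> None:
        self.max_active = max_active
        self.max_waiting = max_waiting
        self.active: Set[str] = set()
        self.waiting: Deque[str] = deque()

    def arrive(self, visitor: str) -> Tuple[int, str]:
        """Return an HTTP-style status and a short message for this visitor."""
        if visitor in self.active:
            return 200, "admitted"
        if visitor in self.waiting:
            return 202, f"queued at position {self.waiting.index(visitor) + 1}"
        if len(self.active) < self.max_active and not self.waiting:
            self.active.add(visitor)
            return 200, "admitted"
        if len(self.waiting) < self.max_waiting:
            self.waiting.append(visitor)
            return 202, f"queued at position {len(self.waiting)}"
        # The queue is full too: refuse cheaply instead of taking on more work.
        return 503, "busy, retry later"

    def leave(self, visitor: str) -> None:
        """Free a place and admit waiting visitors in arrival order."""
        self.active.discard(visitor)
        while self.waiting and len(self.active) < self.max_active:
            self.active.add(self.waiting.popleft())


if __name__ == "__main__":
    room = WaitingRoom(max_active=2, max_waiting=1)
    for name in ["ann", "bob", "cat", "dan"]:
        print(name, room.arrive(name))
    room.leave("ann")
    print("cat", room.arrive("cat"))
```

The design choice is that the refusal path does the least work of all, so the more overloaded the site becomes, the cheaper each extra request is to turn away.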