social, health, political imagery through the lens of George J Huba PhD © 2012-2019

Posts tagged internet

Yesterday I worked on my post about John Tukey and his contributions to statistics, data analysis, and my cell phone addiction.

As I did research to supplement my personal knowledge about Dr. Tukey (near the end of his life, a good friend did work with him, and one of my grad school professors, Bob Abelson, was one of his most influential students), I noticed the brevity of the bio in Wikipedia about him (less than half a window on my computer) and contrasted this to the large number of screens of information available on the Kardashians, Justin Bieber, Rodrigo Borgia, Al Capone, and Richard Nixon. Even R2D2 has a much longer biographical entry.


At many times the Internet is like ancient Rome (bread and circuses) or an episode of (un)reality television.

I dread to think how the aliens in the next galaxy are going to react when the television waves hit their planets. The two likely responses I forecast will be to either classify humans as a lower life form or to be delighted they have all the episodes of the Kardashians. I am betting on the latter (or probably both).

It makes me sad.


PART 1 discussed my view that a world wide memory is available to supplement an aging (and especially cognitively impaired) person’s biological personal memory (a.k.a. the brain).

Seems obvious, but is it?

I contend that even though Google and the huge information database contained on the Internet have been around for a while, it is only just now starting to be understood that this information can be “mined” and reorganized for individuals.

It’s not just about Facebook either, although Facebook is an important part of it. As are all of the other social networks, the stuff for sale on the Internet, the old stuff on your computer, and the old stuff on the computers of your extended family.

It’s all about visualization, visual information processing, and rearranging that visual information for the individual. Like your Uncle Fred who is “losing it” or your Mom who has lost it or yourself. Or leaving behind visualizations for your kids and grandkids or your spouse (who even after decades will not know how you view all of the things that shaped you and are important).

In the spirit of visualization, let’s go to a mind map for explaining visual thinking.



Or same map, slightly different format …


Irv Oii is known to many international news organizations and researchers as a star data journalist. Being a home worker (although home may be the UK, Ohio, the Middle East, Central Africa, Hong Kong, or Antarctica) and a fairly reclusive person, nobody seems to have met Irv. Some speculate that he might be a Jewish Asian-American. Others believe Irv is short for Irvelina, a Russian immigrant physician who went to Ohio (or was it Ojai, California?) when the Soviet science programs collapsed and turned into the lower-funded Russian collaborative efforts with the EU and USA. The collapse of the Soviet Union resulted in the closing of her laboratory in Minsk. Some even think Irv Oii is an acronym.

Irv is thus an enigma and no pictures of her/him seem to exist. An artist’s conception (mine) based on the writings and consultations of Irv Oii on healthcare breakthroughs is shown below. My belief is that a portrait of Irv should hang over the desk of every data journalist and researcher.


Irv Oii

I confess. In 1979 Pete Bentler and I published an article entitled “Simple Minitheories of Love” in the highest prestige journal on personality and social psychology.

Blame it on the exploits of the greatest psychometrician of his generation and a 28-year-old wanna-be psychometrician, both active personality researchers, trying to convince the field that the new statistical modeling methods (Structural Equation Models; LISREL) they were testing would revolutionize the field (I was wrong on that one, too).

Now ask yourself why neither of these guys (nor any of the other main figures in the fields of psychometrics, sociometrics, personality, social psychology, and attraction research) ever went on to start a web site to match individuals on the basis of personality and life style questionnaires (I won’t dignify them by calling them tests); such sites became quite lucrative. This was in spite of the fact that at least one (Huba) had the opportunity to do so during the years when he was the Vice President of R&D for a major psychological testing company and later when most of the other competing testing companies hired him as a consultant. Or why did the major personality test developer of his generation and the owner of a psychological testing company (the late Doug Jackson) never consider developing such a product?

See a pattern here? Even the folks who made the most $$$ from psychological instruments and had the most influence in the psychological assessment journals and industry did not develop a Love Site.

I concede that a Love Site may be a good place to find people you might never meet otherwise through your social and work friends, and these might be good mates or sex partners. Or they might be psychopaths, perpetrators of sexual or domestic violence, dependent individuals, or alcoholics.

So far as I can tell from the undisclosed algorithms of the dating sites and their unpublished outcomes, I have no way of knowing for sure if the sites have a good chance of producing a good outcome and avoiding a terrible (and life-threatening) one. I suspect that if there were strong scientific evidence that the sites “work” in both cases, there would be a lot of scientific research published that supports this notion. Where is the incontrovertible evidence? Can I read it or hear it at professional conventions? Claims on TV that a lot of people got married mean little or nothing without information about comparison groups or negative outcomes.

I would have no problem concluding that the Love Sites are effective if there were psychometric and other scientific evidence that the algorithms used are valid. Without such evidence, I worry that they are more voodoo and “smoke and mirrors” than places where you can find a mate and your date will not result in a rape. Of course I cannot prove my position is right, but neither can the Love Sites. My stance is safer for individuals.

There is that old fashioned system of “meet and greet and respect the people you meet” that did produce so many humans that we now have a problem with world-wide population growth. Sometimes older methods work better if you are patient.

Love Sites

Aaahhh… GiGo (garbage in/garbage out). The GiGo phenomenon haunts data analysts, statisticians, researchers, theorists, and someone who loses their identity.

So these huge [health] datasets we keep hearing about … who controls them? what is their validity? reliability? utility? who else gets to see them?

And the data mining algorithms… proprietary or public? based on which tests and algorithms? who developed? who validated? are the methods valid? reliable? have utility?

And the results coming out of big data and proprietary data mining algorithms… reliable? valid? useful? clearly interpreted? limitations stated? misinterpreted?

Are big data and data mining about using world-wide data to find solutions to some of the world’s problems or to sell more books, videos, and cola?

I don’t think anyone really understands the big data sets and their limitations. I doubt that more than a small percentage of the data mining algorithms are valid. I sure as hell do not want somebody blindly using these algorithms on data they do not understand and then helping the government limit healthcare visits for high-need, low-resource individuals (sound familiar to anyone?).

An experienced statistician-data analyst-methodologist knows that when analyzing a large data set you must spend 98% of your time looking at (and fixing if possible) bad data points. The final 2% of your work is then much more likely to show something that is reliable, valid, and useful.

Big Data may save us, or it might kill us first. Or it might make us Borg or batteries.

Right now the analysts are reticulating splines.

No mo …. GiGo. [Is Nicki Minaj available to record this mantra?]


In the USA, you do not have the right to yell fire in crowded theaters. And you shouldn’t. There are limits to free speech and the invasion of privacy. Somebody needs to tell Emperor Trump that.

It is now possible to data mine public information, the Internet, photos of my home from outer space, credit card records, records of what I read (from book purchases and Internet clicks), records of what I watch (from movie ticket purchases and Internet clicks and cable/satellite clicks), my family tree (and what they read, watch, buy, etc.), my health insurers’ reimbursements, my pharmaceutical purchases, and once again what I charge to my credit card. Somebody knows that I regularly go to a Thai restaurant in Chapel Hill and the Panera in Durham (I never use a credit card at McDonald’s so no one will know I like steak and egg bagels) and get my prescriptions filled in Chapel Hill. In many real ways it is now possible for data miners to create models that are almost me by adding in routines that account for the fact that I am predictably unpredictable. And even more scary, they can use a computer model of me to infer things about my family (genetically linked conditions, personality proclivities, intelligence, potential problem behaviors, lifespan). Really sucks, doesn’t it?

WTF have we been thinking? If there was ever an industry in need of regulation, it is the folks who are recreating ME as a computer model. Don’t feel left out… they are also recreating YOU.

I predict that a new industry will arise to “fool” the data miners and make the computer models less accurate by adding random noise and random data to the information the marketers use. This can be done by a variety of techniques probably well-known to the CIA, FBI, NSA, Amazon, Google, health insurers, and Walmart. I currently try to manually introduce random information into my various computer profiles.

You do not have the right to be ME. In person or on the Internet. For any purpose.

To the data miners and marketers who are stealing ME and YOU and YOU and the next two generations of our families I say … “I wonder if there is a special corner in Hell for you.”


The penalty for yelling fire in a crowded theater might well be fire.

PS. For those of you who do not like Obamacare and universally covering pre-existing conditions, remember that current computer models are sophisticated enough to make a good estimate of the odds your own unborn great-grandchildren will have certain serious medical conditions and behavior problems. Hell, we better cover pre-existing conditions for the next six generations before somebody decides to pre-disqualify my unborn grandchildren.