More on Rough Sets and Species Identity, With Application to the sorta Operator

According to Zdzisław Pawlak (the inventor of the mathematical concept of rough sets, from whose online introductory book, posted here, I will quote liberally below), a rough set is a formal approximation of an ordinary or crisp set. It can be defined mathematically by a pair of ordinary sets, the lower approximation R_*(X) and the upper approximation R^*(X), which together approximate the set X within the universal domain set U. He defines the rough set as follows:

The rough set concept can be defined quite generally by means of topological operations, interior and closure, called approximations. Let us describe the problem more precisely. Suppose we are given a set of objects U called the universe and an indiscernibility relation R ⊆ U × U, representing our lack of knowledge about elements of U. For the sake of simplicity we assume that R is an equivalence relation. Let X be a subset of U. We want to characterize the set X with respect to R. To this end we will need the basic concepts of rough set theory given below.
  • The lower approximation of a set X with respect to R is the set of all objects, which can be for certain classified as X with respect to R (are certainly X with respect to R).
  • The upper approximation of a set X with respect to R is the set of all objects which can be possibly classified as X with respect to R (are possibly X in view of R).
  • The boundary region of a set X with respect to R is the set of all objects, which can be classified neither as X nor as not-X with respect to R.
    Now we are ready to give the definition of rough sets.
  • Set X is crisp (exact with respect to R), if the boundary set is empty.
  • Set X is rough (inexact with respect to R), if the boundary set is nonempty.
    Thus a set is rough (imprecise) if it has nonempty boundary region; otherwise the set is crisp (precise). This is exactly the idea of vagueness proposed by Frege.

    Pawlak goes on to define useful regions of the rough set X as:

    Formal definitions of approximations and the boundary region are as follows:
  • R-lower approximation of X [is] R_*(X) = ⋃{ R(x) : x ∈ U, R(x) ⊆ X }
  • R-upper approximation of X [is] R^*(X) = ⋃{ R(x) : x ∈ U, R(x) ∩ X ≠ ∅ }
  • R-boundary region of X [is] RN_R(X) = R^*(X) − R_*(X)
    Later he notes several properties of the approximations R_*(X) and R^*(X), such as that

  • R_*(−X) = −R^*(X)
  • R^*(−X) = −R_*(X)
    These properties are notable because they allow rough sets to avoid Gareth Evans' challenge to vague identity, which I previously blogged about here: the complement of possible identity is not thereby definite identity.
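
    For concreteness, here is a minimal sketch in Python (mine, not Pawlak's) of these definitions, assuming the indiscernibility relation R is supplied as a partition of U into equivalence classes; the function names and the toy universe are purely illustrative.

```python
# A minimal sketch of Pawlak's approximations, assuming the indiscernibility
# relation R is given as a partition of the universe U into equivalence classes.

def lower_approximation(partition, X):
    """R_*(X): union of the equivalence classes wholly contained in X."""
    return set().union(*(block for block in partition if block <= X))

def upper_approximation(partition, X):
    """R^*(X): union of the equivalence classes that intersect X."""
    return set().union(*(block for block in partition if block & X))

def boundary_region(partition, X):
    """RN_R(X) = R^*(X) - R_*(X): objects neither certainly X nor certainly not-X."""
    return upper_approximation(partition, X) - lower_approximation(partition, X)

# Toy universe partitioned by an indiscernibility relation R.
partition = [frozenset({1, 2}), frozenset({3, 4}), frozenset({5})]
U = set().union(*partition)
X = {1, 2, 3}

print(lower_approximation(partition, X))   # {1, 2}       -- certainly X
print(upper_approximation(partition, X))   # {1, 2, 3, 4} -- possibly X
print(boundary_region(partition, X))       # {3, 4}       -- nonempty, so X is rough

# The complement duality noted above: R_*(-X) = -R^*(X), with -X taken as U - X.
print(lower_approximation(partition, U - X) == U - upper_approximation(partition, X))  # True
```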

    Furthermore, there is a definition of accuracy for a rough set, where B is the set of attributes (the criteria) used to define the approximation:

    [A] Rough set can be also characterized numerically by the following coefficient
    α_B(X) = | B_*(X) | / | B^*(X) |
    called accuracy of approximation, where |X| denotes the cardinality of X. Obviously 0 ≤ α_B(X) ≤ 1. If α_B(X) = 1, X is crisp with respect to B (X is precise with respect to B), and otherwise, if α_B(X) < 1, X is rough with respect to B (X is vague with respect to B).
    Finally, I note that Wikipedia says that:

    Clearly, when the upper and lower approximations are equal (i.e., boundary region empty), then α_B(X) = 1, and the approximation is perfect; at the other extreme, whenever the lower approximation [ B_*(X) ] is empty, the accuracy is zero (regardless of the size of the upper approximation).
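
    As a small sketch (mine, not from Pawlak or Wikipedia), the accuracy coefficient can be computed directly from the two approximation functions sketched earlier; for a nonempty X the upper approximation always contains X, so the division below is safe.

```python
# Sketch of the accuracy coefficient alpha_B(X) = |B_*(X)| / |B^*(X)|,
# reusing the approximation functions sketched above.

def accuracy(partition, X):
    lower = lower_approximation(partition, X)
    upper = upper_approximation(partition, X)
    return len(lower) / len(upper)   # assumes X (hence the upper approximation) is nonempty

# Continuing the toy example: |{1, 2}| / |{1, 2, 3, 4}| = 0.5, so X is rough.
print(accuracy(partition, X))       # 0.5

# If the lower approximation is empty, the accuracy is zero no matter how large
# the upper approximation is, exactly as the passage above says.
print(accuracy(partition, {2, 3}))  # 0.0 -- no equivalence class lies wholly inside {2, 3}
```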


    Anyway, I've been intrigued by how the idea of a species as a rough set can be used to analyze how the species category concept is borrowed by other domains, like consciousness studies, where there is discussion of the possibility of a truly intelligent and conscious computer. Rough sets of the type used to model Darwin's species concept are given by paired approximations with a well-defined inner "definitely is" boundary and an outer "definitely is not" boundary, which together delimit an uncertain zone for the rough set.

    As applied to organisms in biology, we would say that any given individual organism can certainly be classified within some broad range, but it might be only approximately classified as to species, since its measured characteristics might place it in the vague boundary of a species instead of well within or without a given species definition.

    For example, let's look at the coyote and the dog. Dogs (Canis lupus familiaris) and coyotes (Canis latrans) are separate canine species. How do they differ, and is this a difference where vagueness applies? Let's compare the two species:

    Dogs are of the species Canis lupus familiaris. A dog is a mammal, a ground-dwelling quadruped carnivore-omnivore with prominent canine teeth, which places it in the genus Canis, of which it is the taxonomic archetype. Dog breeding has created the greatest diversity of body types seen within any mammalian species. In general, dogs have deep chests, white nail beds, pale tail tips, and elbow joints placed above the level of the sternum. The ears of a dog may be upright or may droop, but tend to be thin-skinned. Dogs usually run with their tail up or level.

    Coyotes are Canis latrans. Coyotes are ground-dwelling quadruped omnivores with prominent canine teeth, also placing them in the genus Canis. Coyotes have narrower skulls and less heavily muscled jaws than dogs or wolves, and their elbows sit below the sternum, since they have shallower chests and lungs than dogs. Coyotes have long, thick, upright ears, dark tail tips, and dark nail beds.

    What characteristics, then, do we have for species classification (not counting those based on DNA sequencing)? Here is a chart:

    | Characteristic | Dog | Coyote | Abbrev. |
    |----------------|-----|--------|---------|
    | legs | 4 | 4 | L4+ |
    | diet | omni | omni | Do+ |
    | teeth | prominent canines | prominent canines | Tc+ |
    | elbow position | above sternum | below sternum | Ad, Ac |
    | ears | thin, variably floppy | thick, upright, long | Ed, Ec |
    | tail tip | light | dark | Td, Tc |
    | nail beds | light | dark | Nd, Nc |
    Note that there are very pale coyotes with pale tail and nails, and that there are dogs with dark nail beds and dark tails. German shepherds arguably have thick upright ears, and greyhounds have elbows below the sternum. So, looking at a large group of dogs, we might get the following:

    | Characteristics | Count |
    |-----------------|-------|
    | L4+ Do+ Tc+ Ad Ed Td Nd | 74884 |
    | L4+ Do+ Tc+ Ac Ed Td Nd | 36 |
    | L4+ Do+ Tc+ Ad Ec Td Nd | 950 |
    | L4+ Do+ Tc+ Ad Ed Tc Nd | 21 |
    | L4+ Do+ Tc+ Ad Ed Td Nc | 15 |
    | L4+ Do+ Tc+ Ad Ed Tc Nc | 319 |
    | L4+ Do+ Tc+ Ad Ec Tc Nd | 11 |
    So, the accuracy of the criteria above would be 74884 / (36 + 950 + 21 + 15 + 319 + 11 + 74884) ≈ 0.982, or about 98% accuracy for the rough set's criteria in classifying these animals as Canis lupus familiaris.
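
    As a quick check of that arithmetic, here is a small sketch using the hypothetical counts from the table above; only the first pattern meets every dog criterion, so its count stands in for |B_*(X)|, while all seven patterns together stand in for |B^*(X)|.

```python
# Checking the 0.982 figure from the hypothetical counts above. Only the pattern
# matching every dog criterion (Ad Ed Td Nd) is certainly a dog; every listed
# pattern is at least possibly a dog.

counts = {
    "L4+ Do+ Tc+ Ad Ed Td Nd": 74884,
    "L4+ Do+ Tc+ Ac Ed Td Nd": 36,
    "L4+ Do+ Tc+ Ad Ec Td Nd": 950,
    "L4+ Do+ Tc+ Ad Ed Tc Nd": 21,
    "L4+ Do+ Tc+ Ad Ed Td Nc": 15,
    "L4+ Do+ Tc+ Ad Ed Tc Nc": 319,
    "L4+ Do+ Tc+ Ad Ec Tc Nd": 11,
}

lower_size = counts["L4+ Do+ Tc+ Ad Ed Td Nd"]   # |B_*(X)|
upper_size = sum(counts.values())                 # |B^*(X)|

print(round(lower_size / upper_size, 3))          # 0.982
```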

    An important point to make here is that dogs and coyotes are close relatives: they are both canids of the genus Canis, and they have occasionally been known to interbreed. A canid that seems to be not quite a dog, because it is a little like a coyote, is still in almost all respects like other canids that we are sure are dogs. We'll return to this fact later.

    In several writings over the last few years, including the recent book Intuition Pumps and Other Tools for Thinking (2014), philosopher Daniel Dennett has described a sorta operator, which he illustrates by example as an artificial-intelligence analog of evolutionary gradualism:

    What we might call the sorta operator is, in cognitive science, the parallel of Darwin's gradualism in evolutionary processes. Before there were bacteria there were sorta bacteria, and before there were mammals there were sorta mammals and before there were dogs there were sorta dogs, and so forth. We need Darwin's gradualism to explain the huge difference between an ape and an apple, and we need Turing's gradualism to explain the huge difference between a humanoid robot and hand calculator. (p. 96)

    The problem with talking about apes and apples with regard to the sorta operator, however, is that we don't consider an ape to be sorta an apple, though we might consider a coyote to be sorta a dog! Evolution, as a theory of the origins of life or of the major categories of organisms, NEVER says that a multicellular animal is gradually a multicellular plant. Rather, it says that unicellular organisms that were sorta unicellular plants and sorta unicellular animals diverged at some point. It is probable that during that divergence the proto-plant cells were vaguely like the proto-animal cells. Because some aspects of evolutionary specialization appear to be one-way in their effects, there is no sorta path from apple to ape! So the analogy of a sorta operator running from apple to ape fails here, and I think Dennett's inadequate grasp of paleobiology is sorta leading him into a category error. Ironically, since it is the historical, empirical course of evolutionary history that contradicts Dennett's gradualism here, if the creationists were right and our current understanding of evolutionary history were wrong, Dennett's ideas might be closer to validity.

    The analogy in Intuition Pumps and Other Tools for Thinking between species vagueness, treated in this blog as a rough set, and machine/human intelligence fails to work well in at least one more respect. Consider our working example of an intelligent and conscious object, a human and the human brain:

    Now consider computers and computer processors:

    Do we see a gradual merging in classification of these two kinds of objects, especially with regard to consciousness? No. As we move away from conscious humans, we move either toward non-conscious humans, such as persons asleep or in a coma, or, across species, toward apes and monkeys. Not toward silicon machines! On the computer side, we move upward from simple calculators to more sophisticated non-conscious, non-intentional AI, but we never reach any examples of truly conscious AI. Dennett admits as much, since he says Turing's gradualism runs between "humanoid robot and hand calculator." Unfortunately, we have no examples of a conscious, intelligent, humanoid robot. So, in rough set terms, for computing machines that are conscious, | R_*(X) | = 0. And thus, by the accuracy measure above, the accuracy of Dennett's sorta operator in classifying AI as intelligent is zero.

    To put this another way, rough set theory tells us that we cannot say any form of computer AI or other computing simulation of consciousness or intelligence is sorta conscious until we have an example that we can classify as definitely conscious. Once we have that, we can find a sorta neighborhood of that truly intelligent AI, just as we can define a sorta conscious region around a set of normal waking humans. But we have no empirical evidence for any sorta neighborhood that contains an existing conscious machine AI.

    Since we have no examples of truly conscious AI, we must take humans as composing our lower approximation R_*(X), and place current computer models of AI in a region (within our universe of objects) of things that are definitely not conscious, in U outside of the upper approximation R^*(X). The accuracy of a sorta operator in identifying true conscious machine intelligence is zero, at least until we actually have a machine consciousness to give us an example of what that machine sorta would be.

    So, while it might make a good intuition pump for the imagination, as a way of pointing to the possibility of conscious computers it looks like the sorta operator is sorta wrong.
