Archive for the 'Data Mining' category

The Truths and Myths of Deep and Dark Web

01 Oct

The internet is the physical medium that emerged in the 1970s; after 1990 the Web became a layer of code over the internet that makes content available to users, and with the arrival of Web 2.0 it also enabled content production by users.
The Deep and Dark Web are hidden parts of the Web that can only be accessed through specific networks, such as I2P, Freenet, Osiris, and a few others.
The best known is the Onion network, whose name recalls the layers of an onion: layer upon layer with no core at the center. It is reached through a special router, TOR (The Onion Router), which can be used even without specialized knowledge, but only for the deep web; the dark web is harder to access and holds many pitfalls.
Recorded Future reports that it represents less than 1% of the Web, which does not mean there are no serious crimes and dangers in these digital environments.
In environments where anonymity is paramount, investigating crime and danger to users is a far more delicate task; care must be taken not to damage data, facilitate crime, or even damage equipment.
In Recorded Future's count of accessible domains, the "live" sites amounted to less than 0.005% of the Web; many others were used at some point, but their content is now disabled.
Of the 55,000 Onion-network domains found, only 8,400 (about 15%) had something active, so the popular iceberg metaphor (pictured) is not very accurate: the visible part is much larger than the submerged one.
Today there are about 200 million unique, active surface-web domains, while the active Onion-network sites, by the numbers presented, make the dark web roughly 0.005% of the size of the World Wide Web, according to the study.
Thus the benefit brought by the Web vastly outweighs its dark side, which, being criminal, must be investigated and fought.
The serious cultural, human, and social crisis we are going through is connected with all the media, but the Web is not the great evil behind the great crisis.


(Portuguese) Perception, images and psychopower

24 Sep

Sorry, this entry is only available in Brazilian Portuguese.


To leave the twentieth century

26 Feb

When many speak of twenty-first-century education, we must ask whether we have actually left the twentieth century: our thinking, paradigms, and forms of education are all frozen in the twentieth century and have little to do with the digital age. In his 1981 book, when the internet was nascent, Morin already said that the future depended on how we educate for the new century.
He warned that, for all our "progress", the danger of war continued to prowl humanity, poverty still persisted on a large scale, and nature was increasingly shaken by man's predatory action in this transformation.
He affirmed, contrary to today's confused thinking, that it was precisely the means of communication, through instantaneous reporting, that let us know what was happening in various parts of the world, the social events of the entire planet.
Of course he warned that there are negative points in this communication: people may come to know the world only through virtual videos and images (which are not unreal) without being in the world, and so the hermeneutic-ontological discourse makes sense, since such communication can create a limited consciousness; but consciousness based on ideologies and religions, which rely on a limited and partial worldview (Weltanschauung), is limited as well.
This being in the world while being absent is well described by the Polish novelist Jerzy Kosinski in his modern parable Being There (O Vidiota, 1979), on the limits of images, videos, and reports.
Edgar Morin makes clear that the news even then (it was 1981) already distorted and hid the real, stating: "Information in a totalitarian system is not only government information; it is, above all, totalitarian governmental information", and it is worth consulting any ranking of press freedom.

This happens not only in the known dictatorships: in present-day Bulgaria there are numerous complaints that the press is not free, and it occupies 111th place in the ranking, where Brazil is 102nd, Paraguay 107th, Bolivia 110th, Colombia 130th, Venezuela 143rd, China 176th, one point above Syria, and North Korea 180th.
You can see in the images that the darkest regions are those closest to conflicts and wars; the existence of misinformation is therefore no excuse to censor the press, since it is through the denunciation of facts that we can help society adjust.
Edgar Morin's book To Leave the Twentieth Century may at first glance seem full of reactionary theses; it never was. Though written long ago, read carefully it proves to be a series of serious considerations based on concrete facts, contributing to a truly more progressive and current vision of the contemporary world, and it can still help us leave the twentieth century and enter the real problems and situations of the 21st.
Morin, E. Pour sortir du XXe siècle, 1981.

See Morin's thinking on the current crisis:


AI will help fight fires

21 Feb

The famous US Department of Defense (DoD) has launched a program to use artificial intelligence (AI) to analyze drone data and improve how forest fires are fought; the news is in the Wall Street Journal this week.
The project uses algorithms that evaluate photos and videos to map forest fuels and improve containment efforts, combining data on resources from helicopters with data from drones and other sources in cyberspace, including administrative functions, as part of the effort to improve efficiency.
A program to monitor and fight forest fires in California and other parts of the country is one of two pilot projects the Pentagon unveiled on Tuesday.
The program, the Pentagon explained, is part of a new artificial intelligence innovation strategy, which must work together with academia and industry to advance information management and data application.
The causes of the fires are clear, in Portugal as well, where a combination of vulnerabilities can spark blazes, as in the serious Camp Fire of 2018 in California; see the CNBC video:
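The WSJ report does not detail the project's algorithms; as a minimal illustrative sketch (all names invented here, and far simpler than the learned models a DoD project would use), a naive color-threshold detector over aerial imagery could look like this:

```python
# Illustrative sketch only: a naive "fire pixel" detector over an RGB image,
# far simpler than the learned models the DoD project would actually use.

def fire_pixel(r, g, b):
    """Heuristic: flames are strongly red, moderately green, weakly blue."""
    return r > 180 and g > 60 and b < 100 and r > g > b

def fire_score(image):
    """Fraction of pixels flagged as fire-colored; image is rows of (r, g, b)."""
    pixels = [px for row in image for px in row]
    flagged = sum(1 for (r, g, b) in pixels if fire_pixel(r, g, b))
    return flagged / len(pixels)

def detect_fire(image, threshold=0.05):
    """Flag an image when enough of it looks like flame."""
    return fire_score(image) >= threshold
```

A real system would replace the hand-written color rule with a model trained on labeled drone footage, but the pipeline shape (per-pixel evidence aggregated into a per-image decision) is the same.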


We’re getting close, but what?

17 Sep

At age 20, Carl Sagan's book Contact (1985) impressed me so much that it never left my imagination: it spoke of wormholes (possible paths through a fourth dimension), of theology, and of the search for life on other planets. I then took a road into materialism that lasted 20 years, without illusions.

Film Contact, wormholes and AI detection.

At 42, the film Contact (1997) impressed me once again; the protagonist Ellie Arroway (Jodie Foster) worked, in the fiction, for SETI (Search for Extraterrestrial Intelligence). I now discover that the project exists at the University of California, Berkeley, and there they are picking up signals coming from a distant star.

Curiously and thought-provokingly, this is precisely the phase in which I have returned to studying Teilhard de Chardin's Noosphere and the fourth dimension; we are preparing a hologram and an Ode to Christus Hypercubus in Lisbon, a reference to Salvador Dalí's fourth dimension.

SETI researchers from Berkeley, led by the student Gerry Zhang and some collaborators, used machine learning to build a new algorithm for the radio signals identified in a 5-hour period on August 26, 2017 (my birthday, as it happens), but that must be only a coincidence.

With the new algorithm, Zhang and his colleagues reanalyzed the 2017 data and found 72 additional bursts; the signals do not look like communications as we know them, but real bursts, and Zhang and his colleagues foresee a new future for the analysis of radio astronomy signals using machine learning.
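Zhang's team used a deep neural network; as a much simpler stand-in for the detection step (a sketch under invented data, not their method), one can flag candidate bursts in a 1-D intensity series as samples far above the noise level:

```python
# Toy sketch, not Zhang's neural network: flag "bursts" in a 1-D intensity
# series as samples exceeding the mean by k standard deviations.
from statistics import mean, stdev

def detect_bursts(signal, k=3.0):
    """Return the indices of samples more than k sigma above the mean."""
    mu = mean(signal)
    sigma = stdev(signal)
    return [i for i, x in enumerate(signal) if x > mu + k * sigma]
```

The appeal of the machine-learning approach is precisely that it recovers weak bursts a fixed threshold like this one would miss.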

As in the film, the signal needed a long time to be decoded; Turing, who studied the Enigma machine captured from the German army during World War II and deciphered it, would love to study these signals today. The universe of codes is therefore not a human artifact: space is full of them. This is not to say they come from some civilization, but they are there; the cosmic background radiation, for example, discovered in 1965 by Penzias and Wilson, confirmed the Big Bang and earned them the 1978 Nobel Prize in Physics.

The new results will be published this month in The Astrophysical Journal and are available on the Breakthrough Listen website.



Basic Questions of Semantic Web and Ontologies

05 Jul

We are always faced with concepts that seem to be common sense but are not; this is the case with many examples: social networks (confused with the media), fractals (numbers still too generic to be used in everyday life, but important), artificial intelligence, and countless other cases, including the virtual (which is not the unreal), ontologies, etc.

These are the cases of the Semantic Web and Ontologies, where every simplification leads to error. Probably for this reason, one of the forerunners of the Semantic Web, James Hendler, co-wrote the book Semantic Web for the Working Ontologist (Allemang, Hendler, 2008).

The authors explain in Chapter 3 that when we speak of the semantics "of a programming language, we usually refer to the mapping of language syntax to some formalism that expresses the 'meaning' of that language.

Now when we speak of 'semantics' of natural language, we often refer to something about what it means to understand the utterance: how to go from the structured letters or sounds of a language to some kind of meaning behind them.

Perhaps the most primitive part of this notion of semantics is a representation of the connection of a term in a statement to the entity in the world to which the term refers.” (Allemang, Hendler, 2008).

When we talk about things in the world, in the case of the Semantic Web we talk about Resources; as the authors say, this is perhaps the most unusual use of the word resource. To describe them, a definition language called RDF, the Resource Description Framework, was created, and on the Web resources have a basic identification unit called the URI, the Uniform Resource Identifier.
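The triple-and-URI idea can be sketched in plain Python (the namespace and facts below are invented for illustration; real systems would use an RDF library and SPARQL):

```python
# Minimal sketch of the RDF idea: each fact is a (subject, predicate, object)
# triple, with URIs identifying the resources involved.
EX = "http://example.org/"  # hypothetical namespace, for illustration only

triples = {
    (EX + "TimBL", EX + "created", EX + "Web"),
    (EX + "TimBL", EX + "worksAt", EX + "W3C"),
    (EX + "Web", EX + "runsOn", EX + "Internet"),
}

def match(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return {
        (ts, tp, to) for (ts, tp, to) in triples
        if s in (None, ts) and p in (None, tp) and o in (None, to)
    }
```

Querying with wildcards, e.g. `match(triples, s=EX + "TimBL")`, is the essence of what SPARQL does over real RDF stores.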

In the book the authors develop an advanced form of RDF called RDF-Plus, which already has many users and developers, and also model ontologies using a language of its own, OWL. The first application presented is SKOS, the Simple Knowledge Organization System, which proposes organizing concepts such as thesauri, taxonomies, and controlled vocabularies in RDF.

Because RDF-Plus is a modeling system that provides considerable support for distributed information and the federation of information, it introduces the use of ontologies in the Semantic Web in a clear and rigorous, though complex, way.

Allemang, D. Hendler, J. Semantic Web for the Working Ontologist: Effective Modeling in RDFS and OWL, Morgan Kaufmann Publishing, 2008.



Trends in Artificial Intelligence

10 May

By the late 1980s the promises and challenges of artificial intelligence seemed to crumble, as in Hans Moravec's phrase in his 1988 book Mind Children: "it is comparatively easy to make computers exhibit adult-level performance on intelligence tests or playing checkers, and difficult or impossible to give them the skills of a one-year-old when it comes to perception and mobility."
Marvin Minsky, one of the great precursors of AI (Artificial Intelligence) and co-founder of the MIT Artificial Intelligence Laboratory, declared in the late 90s: "The history of AI is funny, because the first real deeds were beautiful things, a machine that made demonstrations in logic and did well in calculus courses. But then, trying to make machines capable of answering questions about the simple stories a first-year primary school child reads, the machines could not. Today there is no machine that can achieve this." (KAKU, 2001, p. 131)
Minsky, together with another AI forerunner, Seymour Papert, went on to glimpse a theory, The Society of Mind, which sought to explain how what we call intelligence could be the product of the interaction of non-intelligent parts; but AI would take another path, and both died in 2016 seeing the turn of AI without seeing the "society of mind" emerge.
Thanks to the demands of the nascent Web, whose data lacked "meaning", AI's work would join the efforts of Web designers to develop the so-called Semantic Web.
There were already softbots, or simply bots: software robots that navigated raw data looking to "capture some information". In practice they were scripts written for the Web or the Internet, which could now have a nobler function than stealing data.
The idea of intelligent agents was revived; built from fragments of code, they have a different function on the Web, that of tracking semi-structured data and storing it in new kinds of databases, which are no longer only Structured Query Language (SQL) databases but instead handle the questions and answers made on the Web. These databases are called NoSQL, and they will also serve as a basis for Big Data.
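The contrast with SQL can be sketched in a few lines (the records below are invented for illustration; real NoSQL systems such as document stores add indexing, distribution, and persistence):

```python
# Illustrative sketch of the NoSQL document-store idea: semi-structured
# records, each with its own fields, queried by pattern rather than by a
# fixed relational schema.
docs = [
    {"type": "page", "url": "http://example.org/a", "tags": ["ai", "web"]},
    {"type": "tweet", "user": "alice", "text": "semantic web"},
    {"type": "page", "url": "http://example.org/b", "tags": ["rdf"]},
]

def find(docs, **pattern):
    """Return the documents whose fields match every key/value in the pattern."""
    return [d for d in docs if all(d.get(k) == v for k, v in pattern.items())]
```

Note that the three records do not share a schema: a query like `find(docs, type="page")` simply ignores fields a document does not have, which is what makes such stores suited to scattered Web data.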
The emerging challenge now is to build taxonomies and ontologies from this scattered, semi-structured Web information, which does not always respond to a well-crafted questionnaire or to logical reasoning within a clear formal construction.
In this context linked data emerged: the idea of linking the data of resources on the Web, identifying them by URIs (Uniform Resource Identifiers), which record and locate data on the Web.

The disturbing scenario of the late 1990s took a semantic turn in the 2000s.

KAKU, M. (2008) The Physics of the Impossible: A Scientific Exploration of the World of Phasers, Force Fields, Teleportation, and Time Travel. NY: Doubleday.



AI can detect hate speech

17 Oct

Hate speech is growing on social media, and identifying it from a single source can be dangerous and biased. Because of this, researchers from Finland trained a machine learning algorithm to identify hate speech, computationally comparing what differentiates texts in order to place discourse in a categorization system as "hateful" or not.
The researchers used the algorithm daily to review all the open content that candidates in municipal elections generated on both Facebook and Twitter.
The algorithm was taught using thousands of messages, which were cross-checked to confirm scientific validity. According to Salla-Maaria Laaksonen of the University of Helsinki: "When categorizing messages, the researcher must take a position on language and context and therefore it is important that several people participate in the interpretation of the didactic material"; for example, someone may make a hateful speech to defend themselves from an odious action, and without several interpreters the hatred could be identified only unilaterally.
She says social media services and platforms could identify hate speech if they chose to, and thus influence the activities of Internet users: "there is no other way to extend it to the level of individual citizens," says Laaksonen. That is, the systems are semi-automatic, because they assume human interaction in the categorization.
The full article can be read on the website of Aalto University of Helsinki.
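The article does not publish the classifier itself; as a toy stand-in for the supervised approach it describes (hand-labeled examples, then automatic categorization), here is a tiny Laplace-smoothed Naive Bayes over word counts, with invented training data:

```python
# Toy sketch of supervised text categorization, not the Finnish group's model:
# a tiny Naive Bayes over bag-of-words counts from hand-labeled examples.
from collections import Counter
from math import log

def train(examples):
    """examples: list of (text, label). Returns per-label word counts."""
    counts = {}
    for text, label in examples:
        counts.setdefault(label, Counter()).update(text.lower().split())
    return counts

def classify(counts, text):
    """Pick the label whose word distribution best explains the text."""
    words = text.lower().split()
    vocab = len({w for ctr in counts.values() for w in ctr})

    def score(label):
        c = counts[label]
        total = sum(c.values())
        # Laplace-smoothed log-likelihood of the words under this label
        return sum(log((c[w] + 1) / (total + vocab)) for w in words)

    return max(counts, key=score)

# Invented hand-labeled examples, standing in for the cross-checked corpus
data = [
    ("i hate you idiots", "hateful"),
    ("you are all idiots", "hateful"),
    ("have a nice day", "neutral"),
    ("what a nice election", "neutral"),
]
model = train(data)
```

The point Laaksonen makes survives even in this toy: the model can only reproduce the judgments encoded in its labels, which is why several people must participate in labeling.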


The Web 4.0 emerges?

31 Oct

Tim Berners-Lee's initial impulse in creating, in the early 90s, another protocol on top of the Internet (the Web and the Internet are different things) was to spread scientific information more quickly, so we can say it was an information-centered Web.
The Web quickly became popular; then, as concern for the Semantic Web grew, Berners-Lee, James Hendler, and Ora Lassila published the inaugural paper "The Semantic Web: a new form of Web content that is meaningful to computers will unleash a revolution of new possibilities", where further development was designed in terms of knowledge representation, ontologies, intelligent agents, and finally an "evolution of knowledge".
Web 2.0 had interactivity as its initial feature (O'Reilly, 2005): users became freer to interact with web pages, and could tag, comment on, and share documents found online.
The article pointed to ontologies as a "natural" way to develop and add meaning to information in the Semantic Web, with methodologies coming from Artificial Intelligence, which in the eyes of James Hendler (Web 3.0) had gone through a creative "winter".
But three integrated tools ended up indicating a new path: ontologies helped build simple knowledge organization schemes (SKOS, the Simple Knowledge Organization System); databases could be consulted with a query language called SPARQL; and underlying both was what was already the basis of the Semantic Web, RDF (Resource Description Framework), in its simple descriptive language, XML.
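The SKOS idea of linking concepts by "broader" relations can be sketched in a few lines (the micro-thesaurus below is invented for illustration; real SKOS expresses these links as RDF triples):

```python
# Illustrative sketch of the SKOS "broader" relation: concepts linked into a
# thesaurus-like hierarchy that can be walked upward.
broader = {
    "chess": "board games",
    "checkers": "board games",
    "board games": "games",
}

def ancestors(concept):
    """Walk the broader-chain from a concept up to the top of the hierarchy."""
    chain = []
    while concept in broader:
        concept = broader[concept]
        chain.append(concept)
    return chain
```

In real SKOS each of these links would be a triple such as (chess, skos:broader, board games), queryable with SPARQL.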
The first major project was DBpedia, a database proposed in 2007 by the Free University of Berlin and the University of Leipzig in collaboration with OpenLink Software, structured around Wikipedia, with billions of RDF triples (resource, property, and value, or more simply subject-predicate-object), each indicating a semantic relationship.
There are several types of Intelligent Agents in development, making little or no use of the "intelligence" of Web 3.0; will there be new developments in the future? We pointed out in a recent article the Semantic Scholar tool of the Paul Allen Foundation, but its connection to Web 3.0 (projects related to linked data) is also not clear.
2016 has definitely not been the year of the Smart Web, or if you prefer, Web 4.0; but we are getting closer: personal assistants (Siri, Cortana, Facebook's "M"), home automation (Apple HomeKit, Nest), image recognition, and driverless cars are just around the corner.
Home automation means smart features in the home, and in this field AI is growing fast.


(Portuguese) Is Semantic Scholar a novelty?

26 Oct

Sorry, this entry is only available in Brazilian Portuguese.