Packages

  • package artifact

    Provides classes for dealing with Stack Overflow artifacts.

    Definition Classes
    parsing
  • Artifact
  • ArtifactSerializer
  • DateSerializer
  • SourceInfo
  • StackOverflowAnswer
  • StackOverflowArtifact
  • StackOverflowComment
  • StackOverflowElement
  • StackOverflowPost
  • StackOverflowQuestion
  • StackOverflowUser
  • XmlSourceInfo
  • package model

    Provides base classes to model HAST nodes for Stack Overflow artifacts.

    Overview

    Every node of the HAST implements the HASTNode trait.

    Depending on the specific fragment, nodes implement traits in different packages. For example:

    Special nodes are used for various purposes, and are contained in this package:

    Definition Classes
    parsing
  • package nlp
    Definition Classes
    parsing
  • package units

    Provides classes representing information units (structured and textual units) and meta information (like tf vectors and type mentions).

    Overview

    Information units (implementing the InformationUnit trait) represent paragraphs in a given document, which can be narrative text (NaturalLanguageTaggedUnit) or structured fragments (CodeTaggedUnit). Each information unit exports a set of meta-information objects (implementing the MetaInformation trait), which are ready-made semantic data for simple analyses. This version of StORMeD provides the following meta-information:

    When a kind of meta-information is not available for a unit, an AbsentMetaInformation object may be provided in its place.

    Tutorial

    Suppose you want to get all the types mentioned in a question. First, you retrieve all its information units:

    scala> val questionUnits = question.units
    questionUnits: Seq[ch.usi.inf.reveal.parsing.units.InformationUnit] = ...

    Instead of running a visitor over all the HASTs, you can exploit the ready-made data provided by the meta-information. For example, to get all the meta-information for the units, you can use flatMap:

    scala> import ch.usi.inf.reveal.parsing.units._
    import ch.usi.inf.reveal.parsing.units._
    
    scala> val questionMetaInfos = questionUnits.flatMap { _.metaInformation }
    questionMetaInfos: Seq[ch.usi.inf.reveal.parsing.units.MetaInformation] = List(...)

    You now need to filter these to keep only the CodeTypesMetaInformation instances; from them you can get, for example, the mentioned qualified types:

    scala> val codeTypesMetaInfos = questionMetaInfos.filter { _.isInstanceOf[CodeTypesMetaInformation] }.asInstanceOf[Seq[CodeTypesMetaInformation]]
    codeTypesMetaInfos: Seq[ch.usi.inf.reveal.parsing.units.CodeTypesMetaInformation] = List(...)
    
    scala> val types = codeTypesMetaInfos.flatMap { _.qualifiedTypes }.distinct
    types: Seq[ch.usi.inf.reveal.parsing.model.java.ReferenceTypeNode] = List(...)
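
The type-based filtering shown in the tutorial can also be written with collect, which filters and narrows the element type in one step and avoids the explicit cast. The sketch below is self-contained and does not use the real library: Meta, CodeTypes, and TermFrequency are hypothetical stand-ins for the MetaInformation hierarchy, and qualified types are modelled as plain strings rather than ReferenceTypeNode objects.

```scala
object MetaInfoExample {
  // Hypothetical stand-ins for the MetaInformation hierarchy (illustration only).
  sealed trait Meta
  final case class CodeTypes(qualifiedTypes: Seq[String]) extends Meta
  final case class TermFrequency(terms: Map[String, Int]) extends Meta

  // collect keeps only the elements matched by the partial function and
  // returns them with the narrowed type, so no asInstanceOf is needed.
  def codeTypes(metas: Seq[Meta]): Seq[CodeTypes] =
    metas.collect { case c: CodeTypes => c }

  // Made-up sample data mixing two kinds of meta-information.
  val metas: Seq[Meta] = Seq(
    CodeTypes(Seq("java.util.List", "java.lang.String")),
    TermFrequency(Map("list" -> 2)),
    CodeTypes(Seq("java.lang.String"))
  )

  // All distinct qualified type names, in order of first appearance.
  val types: Seq[String] = codeTypes(metas).flatMap(_.qualifiedTypes).distinct
}
```

With the real classes, the same pattern would presumably read questionMetaInfos.collect { case m: CodeTypesMetaInformation => m }.
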
    Definition Classes
    parsing

package artifact

Provides classes for dealing with Stack Overflow artifacts.

It also provides means to serialize and deserialize them from JSON files.

Tutorial

The main class to consider is ch.usi.inf.reveal.parsing.artifact.StackOverflowArtifact, representing a Stack Overflow discussion.

The main way to obtain a Stack Overflow discussion is to deserialize it from a JSON file, using ArtifactSerializer.

import ch.usi.inf.reveal.parsing.artifact._
val jsonFilePath = /*path to json file*/
// ArtifactSerializer is an object, so it is used directly rather than instantiated with new
val discussion = ArtifactSerializer.deserializeFromFile(jsonFilePath)

You can now inspect the discussion, for example by getting the question (a StackOverflowQuestion object) or the answers (a Seq of StackOverflowAnswer):

scala> val question = discussion.question
question: ch.usi.inf.reveal.parsing.artifact.StackOverflowQuestion = ...

scala> val answers = discussion.answers
answers: Seq[ch.usi.inf.reveal.parsing.artifact.StackOverflowAnswer] = ...

The way to get the object representing the poster of the question (a StackOverflowUser) is the following:

scala> val posterUser = question.owner
posterUser: Option[ch.usi.inf.reveal.parsing.artifact.StackOverflowUser] = Some(StackOverflowUser(...))
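
Because owner is an Option, the poster may be absent from the dump, and client code has to handle both cases. The following is a minimal sketch of two common ways to do that. The User and Question case classes are simplified stand-ins defined only to keep the snippet self-contained (the real StackOverflowUser and StackOverflowQuestion have more fields), and the sample values are made up.

```scala
object OwnerExample {
  // Simplified stand-ins for StackOverflowUser and StackOverflowQuestion (illustration only).
  final case class User(id: Int, displayName: String, reputation: Int)
  final case class Question(id: Int, title: String, owner: Option[User])

  // Returns the poster's display name, or a fallback when the owner is absent.
  def posterName(question: Question): String =
    question.owner.map(_.displayName).getOrElse("<unknown user>")

  // Pattern matching makes both cases explicit.
  def describePoster(question: Question): String = question.owner match {
    case Some(user) => s"${user.displayName} (reputation ${user.reputation})"
    case None       => "poster not present in the dump"
  }

  // Made-up sample data: one question with an owner, one without.
  val withOwner: Question    = Question(1, "Sample title", Some(User(42, "alice", 1200)))
  val withoutOwner: Question = Question(2, "Another title", None)
}
```

The map/getOrElse form is convenient when only one field is needed; the match form is easier to extend when the two cases diverge.
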

You can also easily get the comments (objects of StackOverflowComment), for example the ones posted for the question:

scala> val questionComments = question.comments
questionComments: Seq[ch.usi.inf.reveal.parsing.artifact.StackOverflowComment] = ...

An easy way to get all information units (ch.usi.inf.reveal.parsing.units.InformationUnit) and their HASTs (classes implementing the ch.usi.inf.reveal.parsing.model.HASTNode trait) for a whole discussion, or for a single element (question, answer, or comment), is the units member:

scala> val allUnits = discussion.units
allUnits: Seq[ch.usi.inf.reveal.parsing.units.InformationUnit] = List(...)

scala> val astNodes = allUnits.map { _.astNode }
astNodes: Seq[ch.usi.inf.reveal.parsing.model.ASTNode] = List(...)

For further information about the information units, the HAST and the model, please check the documentation of ch.usi.inf.reveal.parsing.model and ch.usi.inf.reveal.parsing.units.
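
Once a discussion is loaded, typical first steps are locating the accepted answer and ranking answers by score. The sketch below illustrates both with ordinary collection operations. Answer and Discussion are simplified stand-ins (assumptions for this example, carrying only the id, score, and isAccepted fields documented for StackOverflowAnswer), and the sample data is invented.

```scala
object DiscussionExample {
  // Simplified stand-ins for StackOverflowAnswer and StackOverflowArtifact (illustration only).
  final case class Answer(id: Int, score: Int, isAccepted: Boolean)
  final case class Discussion(questionId: Int, answers: Seq[Answer])

  // The accepted answer, if the discussion has one.
  def acceptedAnswer(d: Discussion): Option[Answer] =
    d.answers.find(_.isAccepted)

  // Answers ordered from highest to lowest score.
  def byScore(d: Discussion): Seq[Answer] =
    d.answers.sortBy(a => -a.score)

  // Made-up sample discussion with three answers.
  val sample: Discussion = Discussion(
    questionId = 1,
    answers = Seq(Answer(10, 3, false), Answer(11, 7, true), Answer(12, 5, false))
  )
}
```

Returning an Option from acceptedAnswer mirrors the way the artifact classes model possibly missing data, such as owner.
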


Type Members

  1. trait Artifact extends JsonSerializable with Product

    A software artifact, composed of information units.

  2. trait SourceInfo extends JsonSerializable

    A source info represents a binding between an information unit and its original contents.

  3. case class StackOverflowAnswer (id: Int, questionId: Int, comments: Seq[StackOverflowComment], creationDate: Date, communityOwnedDate: Option[Date], lastActivityDate: Date, lastEditDate: Option[Date], score: Int, isAccepted: Boolean, owner: Option[StackOverflowUser], informationUnits: Seq[InformationUnit]) extends StackOverflowPost with Product with Serializable

    A Stack Overflow answer in a discussion.

    id

    the answer id, as from the Stack Overflow database.

    questionId

    the id of the question this answer replies to.

    comments

    the set of comments posted to this answer.

    creationDate

    the date this answer was posted.

    lastActivityDate

    the date of the most recent activity on this answer.

    lastEditDate

    the last date this answer was edited, if it ever was.

    score

    the score of this answer.

    isAccepted

    true if this was the accepted answer in the discussion.

    owner

    the user who posted the answer, if it is present in the Stack Overflow dump.

    informationUnits

    the sequence of information units for this answer.

    See also

    SourceInfo

    StackOverflowUser

    StackOverflowComment

    StackOverflowQuestion

    StackOverflowArtifact

    Stack Overflow Answer API documentation

  4. case class StackOverflowArtifact (question: StackOverflowQuestion, answers: Seq[StackOverflowAnswer]) extends Artifact with Product with Serializable

    Represents a Stack Overflow artifact, i.e., a discussion.

    question

    the discussion's question.

    answers

    the discussion's answers.

  5. case class StackOverflowComment (id: Int, postId: Int, creationDate: Date, replyToUser: Option[StackOverflowUser], score: Int, isEdited: Boolean, owner: Option[StackOverflowUser], informationUnits: Seq[InformationUnit]) extends StackOverflowElement with Product with Serializable

    A Stack Overflow comment to a post.

    id

    the comment id, as from the dump.

    postId

    the id of the post (question or answer) this comment was posted to.

    creationDate

    the date this comment was created.

    replyToUser

    optionally, the user this comment replies to.

    score

    the score of this comment.

    isEdited

    true if this comment has ever been edited.

    owner

    the poster of this comment, if present.

    See also

    Stack Overflow Comment API documentation

  6. trait StackOverflowElement extends JsonSerializable

    A Stack Overflow element, that is, a question, answer or comment.

  7. trait StackOverflowPost extends StackOverflowElement

    A Stack Overflow post, that is, a question or an answer.

  8. case class StackOverflowQuestion (id: Int, title: String, comments: Seq[StackOverflowComment], tags: Seq[String], creationDate: Date, lastActivityDate: Date, lastEditDate: Option[Date], communityOwnedDate: Option[Date], closedDate: Option[Date], closedReason: Option[String], score: Int, viewCount: Int, owner: Option[StackOverflowUser], informationUnits: Seq[InformationUnit]) extends StackOverflowPost with Product with Serializable

    A Stack Overflow question in a discussion.

    id

    the question id, as from the Stack Overflow database.

    comments

    the set of comments posted to this question.

    tags

    a set of strings representing the tags for this question.

    creationDate

    the date this question was posted.

    lastActivityDate

    the date of the most recent activity on this question.

    lastEditDate

    the last date this question was edited, if it ever was.

    communityOwnedDate

    the date on which this question became community-owned, if present.

    closedDate

    the date on which this question was closed, if it ever was.

    closedReason

    a string representing the reason for which the question was closed, if defined.

    score

    the score of this question.

    viewCount

    the number of times this question was viewed.

    owner

    the user who posted the question, if it is present in the Stack Overflow dump.

    See also

    SourceInfo

    StackOverflowUser

    StackOverflowComment

    StackOverflowAnswer

    StackOverflowArtifact

    Stack Overflow Question API documentation

  9. case class StackOverflowUser (id: Int, acceptRate: Option[Int], displayName: String, link: Option[String], profileImage: Option[String], reputation: Int, userType: Option[String]) extends JsonSerializable with Product with Serializable

    A Stack Overflow user as present in the dump.

    id

    the id of the user as present in the dump.

    acceptRate

    the rate of accepted answers for this user.

    displayName

    the display name for the user.

    link

    an optional URL for the user page.

    profileImage

    an optional profile image URL for the user.

    reputation

    the user reputation.

    userType

    an optional string representing the user type (one of unregistered, registered, moderator, or does_not_exist, if defined).

    See also

    Stack Overflow User API documentation

  10. case class XmlSourceInfo (node: XmlElementNode) extends SourceInfo with Product with Serializable

    An XML source info is a binding between an XML/HTML element representing an information unit and the information unit itself.

    node

    the original XML element representing the source for this information unit.

    See also

    Stack Overflow Post API documentation

Value Members

  1. object ArtifactSerializer

    Serializer/Deserializer utility for artifacts.

    This version works only for StackOverflowArtifact.

  2. object DateSerializer extends CustomSerializer[Date]

    A custom JSON serializer/deserializer for java.util.Date.

    Serializes (deserializes) dates as (from) JSON strings with the format "yyyy-MM-dd".
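
To make the "yyyy-MM-dd" format concrete, the sketch below converts between java.util.Date and such strings using java.text.SimpleDateFormat. It is not the DateSerializer implementation, which this page does not show: the object name, the helper names, and the fixed UTC time zone are assumptions made for this example.

```scala
import java.text.SimpleDateFormat
import java.util.{Date, TimeZone}

object DateFormatExample {
  // SimpleDateFormat is not thread-safe, so a fresh instance is created per call.
  private def formatter(): SimpleDateFormat = {
    val f = new SimpleDateFormat("yyyy-MM-dd")
    f.setTimeZone(TimeZone.getTimeZone("UTC")) // assumption: fixed zone, chosen for this sketch only
    f.setLenient(false)                        // reject impossible dates such as 2015-02-30
    f
  }

  // Date -> "yyyy-MM-dd" string; the time of day is dropped.
  def toJsonString(date: Date): String = formatter().format(date)

  // "yyyy-MM-dd" string -> Date at midnight in the chosen zone.
  def fromJsonString(text: String): Date = formatter().parse(text)
}
```

Note that a day-granularity format cannot preserve the time of day, so a date that goes through this kind of round trip comes back truncated to midnight.
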
